R code underlying the visual and statistical analyses in the paper.

Abstract

Words and phrases may differ in the extent to which they are susceptible to prosodic foregrounding and expressive morphology: their expressiveness. They may also differ in the degree to which they are integrated in the morphosyntactic structure of the utterance: their grammatical integration. We describe an inverse relation that holds across widely varied languages, such that more expressiveness goes together with less grammatical integration, and vice versa. We review typological evidence for this inverse relation in 10 spoken languages, then quantify and explain it using Japanese corpus data. We do this by tracking ideophones (vivid sensory words also known as mimetics or expressives) across different morphosyntactic contexts and measuring their expressiveness in terms of intonation, phonation and expressive morphology. We find that as expressiveness increases, grammatical integration decreases. Using gesture as a measure independent of the speech signal, we find that the most expressive ideophones are most likely to come together with iconic gestures. We argue that the ultimate cause is the encounter of two distinct and partly incommensurable modes of representation: the gradient, iconic, depictive system represented by ideophones and iconic gestures and the discrete, arbitrary, descriptive system represented by ordinary words. The study shows how people combine modes of representation in speech and demonstrates the value of integrating description and depiction into the scientific vision of language.

Metadata

Dingemanse, Mark & Akita, Kimi. 2016. An inverse relation between expressiveness and grammatical integration: on the morphosyntactic typology of ideophones, with special reference to Japanese. Journal of Linguistics.

Preliminaries

Loads required packages (installing them if needed), records version numbers, and sets a ggplot theme.

rm(list=ls()) # clear workspace

# Load packages (installing them if needed)
list.of.packages <- c("rmarkdown","ggplot2","ggthemes","scales","dplyr","tidyr","lme4","Hmisc","extrafont")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
sapply(list.of.packages, suppressPackageStartupMessages(require), warn.conflicts=F,quietly=T, character.only=T)
## Warning: package 'ggplot2' was built under R version 3.2.5
## Warning: package 'ggthemes' was built under R version 3.2.4
## Warning: package 'dplyr' was built under R version 3.2.5
## 
## Attaching package: 'Matrix'
## The following object is masked from 'package:tidyr':
## 
##     expand
## Registering fonts with R

Version information for R base and packages used:

R.version.string
## [1] "R version 3.2.3 (2015-12-10)"
versions <- lapply(list.of.packages,packageVersion)
names(versions) <- list.of.packages
as.data.frame(versions)
##   rmarkdown ggplot2 ggthemes scales dplyr tidyr   lme4  Hmisc extrafont
## 1     0.9.5   2.1.0    3.0.3  0.4.0 0.5.0 0.4.1 1.1.11 3.17.2      0.17
rm(list.of.packages,new.packages,versions)

Data

Loads the full dataset d and creates a copy d.p that contains only data relevant to the analysis based on the predicate integration hierarchy.

load(file="ExpInt_open.RData")

predicates <- c("Quotative","Collocational","Predicative")
d.p <- droplevels(d[which(d$syn %in% predicates),])

str(d) # show structure of data
## 'data.frame':    692 obs. of  13 variables:
##  $ uid         : chr  "9520_1" "3497_1" "9479_1" "3509_1" ...
##  $ idph_romaji : chr  "zuQ-to" "guraguragura-to" "dadadadaQ-te" "hitahitahitahita-to" ...
##  $ utt         : Factor w/ 505 levels "19","25","30",..: 459 120 455 122 322 204 268 291 371 482 ...
##  $ int         : Factor w/ 185 levels "1","2","3","4",..: 168 56 166 56 124 84 108 113 137 178 ...
##  $ morphosyntax: Factor w/ 13 levels "c","h","na","nc",..: 7 7 7 9 7 7 7 13 7 7 ...
##  $ intn        : num  0 0 1 0 0 0 1 1 1 0 ...
##  $ phon        : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ exmo        : num  0 1 1 1 0 1 1 1 0 1 ...
##  $ exp         : logi  FALSE TRUE TRUE TRUE FALSE TRUE ...
##  $ gest        : num  0 0 0 1 0 1 0 0 1 0 ...
##  $ exp_cum     : num  0 1 2 1 0 1 2 2 1 1 ...
##  $ syn         : Ord.factor w/ 4 levels "Quotative"<"Collocational"<..: 1 1 1 4 1 1 1 NA 1 1 ...
##  $ optionality : Factor w/ 3 levels "0","1","NA": 2 2 2 2 2 2 2 3 2 2 ...

The data consist of ideophone tokens (idph_romaji) as they occur in a corpus of Japanese narratives, coded for various measures of expressiveness and morphosyntactic integration. Every ideophone token has a unique identifier uid for cross-referencing purposes. The corpus consists of interviews int (each with a different narrator), which in turn consist of utterances utt, each with one or more ideophones.

The two most important variables are cumulative expressiveness (exp_cum) and morphosyntactic integration (syn). We are interested in the relation between them. In particular, our theoretical account predicts that ideophones with higher expressiveness will be found in constructions with lower integration: an inverse relation between expressiveness and grammatical integration. So in our analysis, expressiveness is the predictor (or independent variable) and morphosyntactic integration the outcome (or dependent variable).

exp_cum (predictor variable) is an ordinal variable with four levels: 0 < 1 < 2 < 3. It is computed by summing three distinct, logically independent binary ordinal variables: intonational foregrounding (intn 1/0), phonational foregrounding (phon 1/0), and expressive morphology (exmo 1/0).
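The summation can be sketched with the first ten tokens printed in the str(d) output above. That exp corresponds to exp_cum > 0 is an assumption, though it is consistent with the values shown there; exp_any is a name introduced here for illustration.

```r
# First ten tokens as printed in the str(d) output
intn <- c(0, 0, 1, 0, 0, 0, 1, 1, 1, 0)
phon <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
exmo <- c(0, 1, 1, 1, 0, 1, 1, 1, 0, 1)

# Cumulative expressiveness: number of expressive features present (0 to 3)
exp_cum <- intn + phon + exmo

# Assumed binary summary: does the token carry any expressive feature at all?
exp_any <- exp_cum > 0

print(exp_cum)
# [1] 0 1 2 1 0 1 2 2 1 1
```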

syn (dependent variable) is an ordered factor. Within the predicate integration hierarchy it has three values, ordered from least to most integrated: Quotative < Collocational < Predicative. It applies to all ideophones in the corpus that are part of a predicate (which is 90% of corpus tokens). These three construction types are derived from a more fine-grained analysis of the morphosyntactic contexts of ideophones, recorded in morphosyntax.
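A minimal sketch of how such an ordered factor behaves when it is subset to the three predicate constructions and later converted to numbers. The data are made up, and "Other" is a hypothetical name for the fourth level that str(d) reports but does not spell out.

```r
# Toy data; "Other" is a hypothetical label for the non-predicate level
levels_all <- c("Quotative", "Collocational", "Predicative", "Other")
toy <- data.frame(
  syn = factor(c("Quotative", "Other", "Predicative", "Collocational", "Quotative"),
               levels = levels_all, ordered = TRUE)
)

# Keep only predicate constructions and drop the now-unused level
predicates <- c("Quotative", "Collocational", "Predicative")
toy.p <- droplevels(toy[toy$syn %in% predicates, , drop = FALSE])

nlevels(toy.p$syn)            # three levels remain
as.numeric(toy.p$syn)         # integer codes follow the level order: 1 3 2 1
toy.p$syn[1] < toy.p$syn[2]   # Quotative < Predicative is TRUE
```

Because the codes follow the level order, a higher number means a more integrated construction, which is why the inverse relation shows up as a negative correlation with exp_cum.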

An alternative, closely related measure of morphosyntactic integration is optionality: the syntactic optionality of the ideophone. It has the virtue of applying to all ideophone tokens in the corpus, but the disadvantage of providing only a binary measure of syntactic integration. It partly overlaps with syn (as optionality is a defining feature of some of the construction types), so it should not be used as a measure independent of it.

Finally, we have collected a measure that is in principle independent of the speech signal: the co-occurrence of iconic manual gesture with the ideophone. It is a binary ordinal variable gest (1/0).

Descriptive stats and plots

We start with some descriptive stats.

d %>%
  group_by(optionality) %>%
  summarise(count=n())
## # A tibble: 3 x 2
##   optionality count
##        <fctr> <int>
## 1           0    83
## 2           1   596
## 3          NA    13
d.p %>%
  group_by(syn) %>%
  summarise(count=n())
## # A tibble: 3 x 2
##             syn count
##           <ord> <int>
## 1     Quotative   389
## 2 Collocational   155
## 3   Predicative    81

For plotting, we set a theme to make all plots consistent and printable.

# Define ggplot2 theme (note: windowsFonts() is only available on Windows)
windowsFonts(GillSans=windowsFont("Gill Sans MT"))
theme_md_bw <- function (ticks=TRUE, base_family="GillSans", base_size=18) {
  ret <- theme_bw(base_family=base_family, base_size=base_size) +
    theme(
      axis.title.x = element_text(vjust=0),
      axis.title.y = element_text(vjust=0),
      plot.margin = unit(c(1, 1, 0.5, 0.5), "lines"),
      panel.border = element_rect(fill = NA,colour = NA),
      legend.key = element_blank(),
      strip.background = element_blank()
    )
  if (!ticks) {
    ret <- ret + theme(axis.ticks = element_blank())
  }
  ret
}

Now let’s see the frequencies of expressive features by construction.

# Tidy data for plotting frequencies
freqs <- d.p %>%
  select(uid,syn,intn,phon,exmo) %>%
  group_by(syn,intn,phon,exmo) %>%
  gather(feature,"present",3:5) %>%
  ungroup() %>%
  group_by(syn,feature,present) %>%
  summarise(count=n())

freqs$present <- as.factor(freqs$present)
levels(freqs$present) <- c("absent","present")
freqs$present <- factor(freqs$present,levels(freqs$present)[c(2,1)], ordered=F)
freqs$feature <- as.factor(freqs$feature)
freqs$feature <- factor(freqs$feature, levels(freqs$feature)[c(2,1,3)], ordered=F)

p <- ggplot(arrange(freqs,present), aes(x=feature, y=count, fill=present)) + 
  geom_bar(stat="identity", position="fill") +
  xlab("") +
  ylab("")
p + facet_wrap(~ syn) + # create faceted plot
  scale_fill_manual(values = c("grey40","grey80")) +
  scale_y_continuous(labels=percent) +
  theme_md_bw() + # further theme options
  theme(axis.text.x=element_text(angle=45, vjust=1.4,hjust=1.1),axis.ticks.x=element_blank()) +
  scale_x_discrete(labels=c("intonation","expr-morph.","phonation")) +
  theme(plot.margin=unit(c(0,0,0.6,0),"cm"),legend.margin=unit(c(-0.4,0,0,0),"cm"),legend.title=element_blank(),legend.position="bottom",panel.grid.minor=element_blank(),panel.grid.major=element_blank()) + 
  theme(strip.text = element_text(size=20)) # sets size of facet labels

ggsave(file="Fig_5_expint_syn_by_exp.png", width=9,height=7)
ggsave(file="Fig_5_expint_syn_by_exp.pdf", width=9,height=7,device=cairo_pdf)

Figure 5: Distribution of expressive features over syntactic constructions

Proportionally, the occurrence of expressive features is more likely in the Quotative and Collocational construction types. Intonational foregrounding is most frequent, but even phonational foregrounding shows the same skew.

Figure 6: Cumulative expressiveness by predicate integration

This figure plots every single ideophone token in the corpus as a function of its cumulative expressiveness and its grammatical integration. The more expressive an ideophone, the more likely it is to occur in a freer construction, and vice versa.

Figure 7: Gesture by cumulative expressiveness & integration

This is essentially the preceding figure, but now showing whether ideophone tokens co-occur with an iconic gesture (filled circles) or not (empty circles). The more expressive and free an ideophone, the more likely it is to come with a gesture.

Correlations

Although the distributional evidence shown in the figures is already telling, it is useful to also make the argument with numbers.

We start by looking at correlations between the individual expressive features and morphosyntactic integration. For correlating binary variables, a Pearson correlation is commonly recommended (for two binary variables it is equivalent to the phi coefficient).

selection <- c("intn","exmo","phon","syn","gest","exp","exp_cum") # select relevant columns for subsequent analyses
d.sel <- as.data.frame(sapply(d.p[,unlist(selection)], as.numeric)) # convert all columns to numeric (factors become integer level codes)

correlations.pearson <- rcorr(as.matrix(d.sel[,c("intn","phon","exmo","syn")]), type="pearson")
correlations.pearson$r
##            intn       phon       exmo        syn
## intn  1.0000000  0.2241448  0.4638629 -0.3192728
## phon  0.2241448  1.0000000  0.3568144 -0.3032321
## exmo  0.4638629  0.3568144  1.0000000 -0.3889349
## syn  -0.3192728 -0.3032321 -0.3889349  1.0000000
# Correcting for multiple comparisons
p <- correlations.pearson$P
p <- c(p[2:4,1],p[3:4,2])
p.adjust(p, method = "bonferroni")
##         phon         exmo          syn         exmo          syn 
## 7.367020e-08 0.000000e+00 2.220446e-15 0.000000e+00 4.662937e-14

Pearson Product-Moment correlations show that values of intn, exmo, and phon are positively correlated with each other and negatively correlated with syn, all ps < 0.0001 when correcting for multiple comparisons using the conservative Bonferroni method.
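Note that the extraction c(p[2:4,1],p[3:4,2]) in the code above yields five p-values and leaves out the exmo and syn pair. If all six pairwise comparisons among four variables are to be corrected, one option is to take the lower triangle of the matrix. The sketch below uses a hypothetical matrix of made-up p-values, not the corpus results.

```r
# Hypothetical symmetric matrix of p-values for four variables (made-up numbers)
vars <- c("intn", "phon", "exmo", "syn")
P <- matrix(NA_real_, 4, 4, dimnames = list(vars, vars))
P[lower.tri(P)] <- c(0.001, 0.020, 0.004, 0.030, 0.0005, 0.010)
P[upper.tri(P)] <- t(P)[upper.tri(P)]   # mirror to make the matrix symmetric

# One p-value per unique pair: the lower triangle has choose(4, 2) = 6 entries
p.raw <- P[lower.tri(P)]

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p.bonf <- p.adjust(p.raw, method = "bonferroni")
print(p.bonf)
```

With rcorr output, the same idea would apply to the P component, whose diagonal is NA and is therefore skipped by lower.tri.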

As measures of expressiveness are logically independent yet correlated with each other, we sum them in a cumulative measure exp_cum and test for a correlation with syn. For correlating two ordinal variables with each other (here, exp_cum 0<1<2<3 and syn Q < C < P), Pearson’s r is less suited, as we don’t know whether the distance between the categories is the same at every scale point. In this case it is better to use Spearman’s ρ. What Spearman does is calculate a correlation coefficient on rankings rather than on the actual data.

correlation.expint <- rcorr(as.matrix(d.sel[,c("exp_cum","syn")]), type="spearman")
cat("rho =",correlation.expint$r[2],", p =",correlation.expint$P[2],", n =",correlation.expint$n[2])
## rho = -0.4489127 , p = 0 , n = 625
# By the way, the significance is not different if we do a pearson correlation
# rcorr(as.matrix(d.sel[,c("exp_cum","syn")]), type="pearson")

A Spearman correlation coefficient was computed to assess the relationship between expressiveness and morphosyntactic integration. There was a negative correlation between the two variables (ρ = -0.45, p < 0.0001, n = 625): increases in expressiveness were correlated with decreases in morphosyntactic integration.
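To make concrete what it means to compute a correlation on rankings, here is a small sketch with made-up ordinal values (not taken from the corpus). It checks that Spearman's ρ equals Pearson's r computed on the ranks, with ties receiving average ranks.

```r
# Toy ordinal data (made-up values, for illustration only)
exp_cum <- c(0, 1, 2, 1, 0, 3, 2, 0, 1, 2)
syn     <- c(3, 2, 1, 1, 3, 1, 2, 3, 2, 1)

# Built-in Spearman correlation
rho.builtin <- cor(exp_cum, syn, method = "spearman")

# The same quantity by hand: Pearson correlation of the (tie-averaged) ranks
rho.by.hand <- cor(rank(exp_cum), rank(syn), method = "pearson")

all.equal(rho.builtin, rho.by.hand)
```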

Using optionality as a measure of integration

So far we have analysed integration as measured by the predicate integration hierarchy (applying to 90% of tokens in the corpus). One reviewer pointed out it might be useful to consider using one measure across the board. Therefore we also use a closely related measure of syntactic optionality, which applies to all tokens in the corpus.

selection <- c("exp_cum","optionality")
d.sel <- as.data.frame(sapply(d[,unlist(selection)], as.numeric)) # convert columns to numeric (factors become integer level codes)

correlation.opt <- rcorr(as.matrix(d.sel), type="spearman")
cat("rho =",correlation.opt$r[2],", p =",correlation.opt$P[2],", n =",correlation.opt$n[2])
## rho = 0.2751873 , p = 1.723066e-13 , n = 692

A Spearman correlation coefficient was computed to assess the relationship between expressiveness and syntactic optionality. There was a positive correlation between the two variables (ρ = 0.28, p < 0.0001, n = 692): higher cumulative expressiveness goes together with syntactic optionality.
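One detail may be worth checking when reusing this code. The str(d) output lists optionality as a factor with the levels "0", "1" and the literal string "NA", and the reported n = 692 equals 83 + 596 + 13, which suggests that the 13 "NA" tokens were retained. Since as.numeric() applied to a factor returns integer level codes, such tokens would enter the computation as the value 3 rather than as missing. A minimal sketch on toy data (not the corpus), assuming one wants the labels rather than the codes:

```r
# Toy factor mimicking the levels shown in str(d): "0", "1" and the string "NA"
optionality <- factor(c("1", "0", "NA", "1"), levels = c("0", "1", "NA"))

# as.numeric() on a factor returns the integer level codes, not the labels
codes <- as.numeric(optionality)

# Going through character recovers the labels; "NA" becomes a true missing value
values <- suppressWarnings(as.numeric(as.character(optionality)))

print(codes)   # level codes
print(values)  # label values with a real NA
```

With true missing values, rcorr would drop those tokens pairwise, so n would be smaller than the full token count.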

Gesture

Our data features iconic gesture as a measure independent of the speech signal. We can use the gesture data to test our proposal that the expressiveness and freedom of ideophones are ultimately due to their depictive nature. If this is the case, we would expect gestures to correlate positively with exp_cum and negatively with syn.

selection <- c("exp_cum","syn","gest") # select relevant columns for subsequent analyses
d.sel <- as.data.frame(sapply(d.p[,unlist(selection)], as.numeric)) # convert all columns to numeric (factors become integer level codes)

correlation.gest <- rcorr(as.matrix(d.sel[,c("exp_cum","syn","gest")]), type="pearson")
correlation.gest$r
##            exp_cum        syn       gest
## exp_cum  1.0000000 -0.4486305  0.5406108
## syn     -0.4486305  1.0000000 -0.2758672
## gest     0.5406108 -0.2758672  1.0000000
# Correcting for multiple comparisons
p <- correlation.gest$P
p <- p[3,1:2]
p.adjust(p, method = "bonferroni")
##      exp_cum          syn 
## 0.000000e+00 9.652217e-10

Pearson product-moment correlations show that exp_cum and gest are positively correlated with each other (r = 0.54, n = 492, p < 0.0001), and that gest is negatively correlated with syn (r = -0.28, n = 492, p < 0.0001), all p values Bonferroni-corrected.

Linear mixed effects modelling

Beyond our predictor exp_cum, which according to our theory predicts syn, there are dependencies in the data. For instance, ideophones are nested within utterances, which are nested within interviews.

To take these dependencies into account, we use mixed effects modelling with the lme4 package. As above, the dependent variable is syn and the main predictor (fixed effect) is exp_cum. We also include random intercepts for utterance (utt) and for interview (int), and we let the effect of expressiveness vary by interview through random slopes, as a proxy for individual differences in expressive speech. (We don’t include random slopes by utt, as we have no reason to believe that expressiveness changes from utterance to utterance; also, the number of observations would then be smaller than the number of random effects, so the model would be uninterpretable.)

d.sel <- d.p
d.sel$syn <- as.numeric(d.sel$syn) # convert to numeric for lme4

expint.null = lmer(syn ~ (1|utt) + (1+exp_cum|int), data=d.sel)

expint.model = lmer(syn ~ exp_cum + (1|utt) + (1+exp_cum|int), data=d.sel)

anova(expint.null,expint.model)
## refitting model(s) with ML (instead of REML)
## Data: d.sel
## Models:
## expint.null: syn ~ (1 | utt) + (1 + exp_cum | int)
## expint.model: syn ~ exp_cum + (1 | utt) + (1 + exp_cum | int)
##              Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
## expint.null   6 1290.8 1317.5 -639.41   1278.8                         
## expint.model  7 1201.5 1232.6 -593.75   1187.5 91.332      1  < 2.2e-16
##                 
## expint.null     
## expint.model ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We constructed a model expint.null with no fixed effect and random effects for utterance and interview (i.e. narrator), and a model expint.model with cumulative expressiveness as fixed effect (and the same random effects structure). A likelihood ratio comparison of the two models shows that expressiveness affected morphosyntactic integration (χ2(1) = 91.33, p < 0.000001).
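The test statistic can be reproduced by hand from the log likelihoods in the anova table above: it is twice the difference in log likelihood, compared against a chi-square distribution with degrees of freedom equal to the difference in number of parameters. This is a sketch of the standard likelihood ratio arithmetic using the rounded values printed above, so the result differs slightly from the reported 91.332.

```r
# Values copied from the anova output above (rounded to two decimals there)
logLik.null  <- -639.41   # expint.null, 6 parameters
logLik.model <- -593.75   # expint.model, 7 parameters
df.diff <- 7 - 6

# Likelihood ratio statistic: twice the gain in log likelihood
chisq <- 2 * (logLik.model - logLik.null)

# Upper-tail probability under a chi-square distribution with 1 df
p.value <- pchisq(chisq, df = df.diff, lower.tail = FALSE)

cat("Chisq =", round(chisq, 2), ", df =", df.diff, "\n")
```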

We do the same for optionality, the alternative binary measure of integration. Here we work with d because optionality data is available for all tokens in the corpus.

selection <- c("syn","optionality")
d.sel <- d 
d.sel[,selection] <- lapply(d.sel[,selection], as.numeric)

expint.optnull = lmer(optionality ~ (1|utt) + (1+exp_cum|int), data=d.sel)

expint.optexp = lmer(optionality ~ exp_cum + (1|utt) + (1+exp_cum|int), data=d.sel)

anova(expint.optnull,expint.optexp)
## refitting model(s) with ML (instead of REML)
## Data: d.sel
## Models:
## expint.optnull: optionality ~ (1 | utt) + (1 + exp_cum | int)
## expint.optexp: optionality ~ exp_cum + (1 | utt) + (1 + exp_cum | int)
##                Df    AIC    BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)
## expint.optnull  6 496.15 523.38 -242.07   484.15                         
## expint.optexp   7 465.84 497.62 -225.92   451.84 32.305      1  1.318e-08
##                   
## expint.optnull    
## expint.optexp  ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
selection <- c("syn")
d.sel <- d 
d.sel[,selection] <- as.numeric(d.sel[,selection])

# By the way since opt is binary we could use glmer with a binomial model; however, then the main model won't converge ("degenerate Hessian with 1 negative eigenvalues")
#expint.optnull.g = glmer(optionality ~ (1|utt) + (1+exp_cum|int), data=d.sel,family="binomial")
#expint.optexp.g = glmer(optionality ~ exp_cum + (1|utt) + (1+exp_cum|int), data=d.sel,family="binomial")

We constructed a model expint.optnull to predict optionality with random effects for utterance and interview, but no fixed effect; and a model expint.optexp with cumulative expressiveness as fixed effect (and the same random effects structure). The full model converged with an AIC of 465.8, BIC 497.6 and log likelihood -225.9, and has a significantly better fit than the null model (log likelihood difference 16.15). Model comparison shows that expressiveness affected syntactic optionality (χ2(1) = 32.305, p < 0.0001).
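The information criteria reported here follow from the log likelihood by the standard formulas AIC = -2 logLik + 2k and BIC = -2 logLik + k ln(n), where k is the number of parameters (the Df column) and n the number of observations. The sketch below redoes this arithmetic from the rounded values in the anova table, taking n = 692 (the number of tokens in d) as an assumption.

```r
# Values copied from the anova output for expint.optexp
logLik.full <- -225.92
k <- 7      # number of estimated parameters (Df column)
n <- 692    # assumed number of observations: all tokens in d

aic <- -2 * logLik.full + 2 * k
bic <- -2 * logLik.full + k * log(n)

cat("AIC =", round(aic, 2), ", BIC =", round(bic, 2), "\n")
```

Both values agree with the table to rounding, which is consistent with the model having been fitted on all 692 tokens.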

What about gesture?

Pearson correlations above show that gesture and cumulative expressiveness are substantially correlated with each other (r = 0.54, p < 0.0001, n = 492). This means it would be inappropriate to add gest as a predictor variable in the model alongside exp_cum, as this would violate the assumption that predictors in the model are not collinear.

By way of a sanity check, we can still see how gest fares as a predictor variable without exp_cum, using the same random effects structure as above to account for imbalances and dependencies in the data:

d.sel <- d.p[,c("utt","syn","exp_cum","int","gest")]
d.sel$syn <- as.numeric(d.sel$syn)
d.sel <- d.sel[complete.cases(d.sel),]

expint.null = lmer(syn ~ (1|utt) + (1+exp_cum|int), data=d.sel)

expint.gest = lmer(syn ~ gest + (1|utt) + (1+exp_cum|int), data=d.sel)

anova(expint.null,expint.gest)
## refitting model(s) with ML (instead of REML)
## Data: d.sel
## Models:
## expint.null: syn ~ (1 | utt) + (1 + exp_cum | int)
## expint.gest: syn ~ gest + (1 | utt) + (1 + exp_cum | int)
##             Df     AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)    
## expint.null  6 1010.48 1035.7 -499.24   998.48                            
## expint.gest  7  994.57 1024.0 -490.28   980.57 17.91      1  2.316e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The model including gesture as a fixed effect converges with an AIC of 994.6, BIC 1024.0 and log likelihood of -490.3. It has a significantly better fit than a model without gesture (χ2(1) = 17.91, p < 0.0001).

Replication

To replicate the analyses presented above, you can run the following chunk of code in a recent version of R. You’ll need the packages listed under ‘Preliminaries’.

# Download the analysis file
download.file('SOURCE_URL','analysis.Rmd')
# Render R markdown analysis file and compute results, then show them
rmarkdown::render('analysis.Rmd')
browseURL(paste('file://', file.path(getwd(),'analysis.html'), sep='')) # shows result