# Welcome

Welcome to the project homepage of **(*an:a*)-lyzer**, an interactive web application for exploring the historical increase in onset strength in the Google Books Ngrams data. This repository contains the source code of the web application, [a link to a live version][1] and a downloadable standalone version.

The following publication provides detailed documentation of the application and its functions, as well as an exemplary analysis of the re-emergence of the /h/ onset from Early Modern to Present-Day English:

[Schlüter, Julia & Vetter, Fabian (2020) "An interactive visualization of Google Books Ngrams with R and Shiny: Exploring a(n) historical increase in onset strength in a(n) huge database." In *Journal of Data Mining and Digital Humanities*. Special issue: *Visualizations in Historical Linguistics* (ed. by Benjamin Molineaux, Bettelou Los and Martti Mäkinen). 25 pages.][6]

[https://jdmdh.episciences.org/7000][4], [https://hal.archives-ouvertes.fr/hal-02149498v4][5]

**Abstract**: Using the re-emergence of the /h/ onset from Early Modern to Present-Day English as a case study, we illustrate the making and the functions of a purpose-built web application named (*an:a*)-lyzer for the interactive visualization of the raw n-gram data provided by Google Books Ngrams (GBN). The database has been compiled from the full text of over 4.5 million books in English, totalling over 468 billion words and covering roughly five centuries. We focus on bigrams consisting of words beginning with graphic <h> preceded by the indefinite article allomorphs *a* and *an*, which serve as a diagnostic of the consonantal strength of the initial /h/. The sheer size of this database allows us to attain a maximal diachronic resolution, to distinguish highly specific groups of <h>-initial lexical items, and even to trace the diffusion of the observed changes across individual lexical units. The functions programmed into the app enable us to explore the data interactively by filtering, selecting and viewing them according to various parameters that were manually annotated into the data frame. We also discuss limitations of the database, of the app and of the explorative data analysis.

**Keywords**: Data visualization, corpus linguistics, historical phonology, historical linguistics, R, Shiny, n-grams, Google Books, Google Books Ngrams

### Related publications

[Schlüter, Julia (2019) "Tracing the (re-)emergence of /h/ and /j/ through 350 years of books: Mergers and merger reversals at the interface of phonetics and phonology." *Folia Linguistica* 40.1 (2019), Special issue: *Diachronic phonotactics* (ed. by Nikolaus Ritt, Andreas Baumann and Christina Prömer). 177–202.][7]

DOI: https://doi.org/10.1515/flih-2019-0009. URN: [https://fis.uni-bamberg.de/handle/uniba/49399][7]

**Abstract**: This paper investigates the (re-)emergence of onset consonants in English loans from French, Latin and Greek, spelt with initial <u> (> /juː/; e.g. *union*, *use*), initial <eu> (> /juː/; e.g. *eulogy*, *euphemism*), or initial <h> (e.g. *habit*, *homogeneous*). It analyses Google Books data, exploiting the occurrence of the article allomorph *a* (rather than *an*) as a diagnostic of consonantal realisation. The analysis yields a fine-grained description of the (re-)emergence of consonantal onsets. It shows that their emergence has been a gradual process and has not reached completion yet. On a theoretical level, the paper discusses the interaction between categorical phonological processing and fine-grained phonetic distinctions in an exemplar-based framework. It also sheds light on the question of (near-)mergers and their potential reversibility.
**Keywords**: *h*-dropping, glide formation, filled-onset constraint, unmerging of (near-)mergers, categorical perception

## Accessing (*an:a*)-lyzer

### Live version

[__Start live version (only <h>-initial lemmas)__][1]

[__Start live version (all data)__][2]

For more details as well as a usage guide, please refer to the documentation.

### Standalone version

The standalone version of (*an:a*)-lyzer includes the source code of the app, a portable version of R with all required packages installed, and a wrapper so that the app can be started easily. The standalone version is based on W. Lee Pang's code for creating Shiny desktop applications (https://github.com/wleepang/DesktopDeployR). You can download the [__most recent version of the app by clicking this link__][3]. To start the app, simply extract the contents of the ZIP file and double-click the file called "(Re-)Emerging_Onset_Consonants.bat". The app will then open in a new browser window.

### Source code

The source code is also available in the OSF repository. It is intended for users who are already familiar with R and wish to run the app with their own installation of R or make changes to the program.
As updated packages and dependencies between packages can cause errors, we have attached the output of the function `sessionInfo()` with which the current version of the app was developed:

```
R version 3.6.0 (2019-04-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] DT_0.6                shinycssloaders_0.2.0 purrr_0.3.2
 [4] shinyjqui_0.3.2       shinydashboard_0.7.1  shinyjs_1.0
 [7] shinyTree_0.2.6       shinyBS_0.61          plotly_4.9.0
[10] scales_1.0.0          splitstackshape_1.4.8 dplyr_0.8.1
[13] ggplot2_3.1.1         data.table_1.12.2     Rmisc_1.5
[16] plyr_1.8.4            lattice_0.20-38       V8_2.2
[19] shiny_1.3.2

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.1        compiler_3.6.0    pillar_1.4.1      later_0.8.0
 [5] tools_3.6.0       digest_0.6.19     viridisLite_0.3.0 jsonlite_1.6
 [9] tibble_2.1.2      gtable_0.3.0      pkgconfig_2.0.2   rlang_0.3.4
[13] rstudioapi_0.10   crosstalk_1.0.0   yaml_2.2.0        curl_3.3
[17] httr_1.4.0        withr_2.1.2       htmlwidgets_1.3   grid_3.6.0
[21] tidyselect_0.2.5  glue_1.3.1        R6_2.4.0          tidyr_0.8.3
[25] magrittr_1.5      promises_1.0.1    htmltools_0.3.6   assertthat_0.2.1
[29] mime_0.6          xtable_1.8-4      colorspace_1.4-1  httpuv_1.5.1
```

[1]: https://eng-ling.uni-bamberg.de/shiny/onset_ngrams?data=edi
[2]: https://eng-ling.uni-bamberg.de/shiny/onset_ngrams
[3]: https://osf.io/9py2v/download
[4]: https://jdmdh.episciences.org/7000
[5]: https://hal.archives-ouvertes.fr/hal-02149498v4
[6]: https://fis.uni-bamberg.de/handle/uniba/49395
[7]: https://fis.uni-bamberg.de/handle/uniba/49395
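For users running the app from source, a minimal sketch of preparing such an R session might look as follows. The package names are taken from the `sessionInfo()` output above; the `"app"` directory path is an assumption that depends on where the source code was extracted, and exact package versions may need to be pinned (e.g. with the `remotes` package) if current releases cause errors.

```r
# Install the user-attached packages from the sessionInfo() output above,
# skipping any that are already present.
pkgs <- c("DT", "shinycssloaders", "purrr", "shinyjqui", "shinydashboard",
          "shinyjs", "shinyTree", "shinyBS", "plotly", "scales",
          "splitstackshape", "dplyr", "ggplot2", "data.table", "Rmisc",
          "plyr", "V8", "shiny")
missing <- setdiff(pkgs, rownames(installed.packages()))
if (length(missing) > 0) install.packages(missing)

# Launch the app; "app" is a placeholder for the directory containing the
# app's app.R (or ui.R/server.R) files from this repository.
shiny::runApp("app")
```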