# Data set of ditransitive alignment patterns

This repository is a data set of case and agreement alignment patterns in over 120 languages.

## Data

The data can be found in `languages.csv`, in which each line represents one data point, a ditransitive or transitive sentence. The CSV file provides the language, source, alignment type, gloss, as well as the case and agreement properties of the arguments for each data point.

### Structure

`languages.csv` is a CSV file (using `;` as a separator) with 24 columns.

- `Language`: Name of language
- `Original`: Original text of examples in source language
- `Gloss`: Gloss (sometimes adapted)
- `Translation`: Translation from source
- `Type`: Case and agreement alignment for ditransitives, Transitive, Intransitive, Recipient passive, or Theme passive
- `Remarks`: Additional remarks
- `Source`: Source of example in plain text
- `Citekey`: biblatex citekey (see [sources.bib](sources.bib))
- `Page`: Page number containing example in source
- `Exno`: Example number in source
- `AddSourceInfo`: Additional source information if available
- `Predicate`: Predicate in source language, e.g. *give*
- `SbjPrs`: Person of subject
- `SbjNbr`: Number of subject
- `SbjGen`: Gender of subject
- `SbjCase`: Case of subject
- `RPrs`: Person of recipient
- `RNbr`: Number of recipient
- `RGen`: Gender of recipient
- `RCase`: Case of recipient
- `TPrs`: Person of theme/patient
- `TNbr`: Number of theme/patient
- `TGen`: Gender of theme/patient
- `TCase`: Case of theme/patient

In the case of passives, `Type` is `Recipient passive` or `Theme passive`, the features of the recipient and the theme are specified, and `SbjPrs`, `SbjNbr`, `SbjGen`, `SbjCase` are left empty.

### Printing data to the console

The script `data.py` provides a few simple ways of accessing the data. The script needs the name of a language in the database as an argument (and optionally takes an example number).
For example, running `./data.py gorwaa` prints the two examples from Gorwaa found in the data set.

```
> ./data.py gorwaa
Example (2), Gorwaa, ICIA (Harvey 2018: 176)
mwalimú kitaabú ng-u-∅-(g)a hariís dír desír
teacher.LMo book.LMo A.3-P.M-AUX-PRF bring.M.PST to girl.LFR
The teacher brought the book to the girl.

Example (3), Gorwaa, SCSA (Harvey 2018: 176)
mwalimú desír ng-a-∅-na kitaabú-i hariís
teacher.LMo girl.LFR A.3-P.F-AUX-IMPRF book.LMo-LAT bring.M.PST
The teacher brought the girl the book.
```

Optionally, one or more example numbers can be provided as arguments. The script then prints a LaTeX-formatted version of the example(s), using the syntax of the [ExPex](https://ctan.org/pkg/expex) package for examples and the [leipzig](https://ctan.org/pkg/leipzig?lang=en) package for glosses. Non-standard abbreviations follow the original authors' usage and might not be available for `leipzig`.

```
> ./data.py gorwaa 3
\ex\label{ex:gorwaa-3}
\begingl
\glpreamble Gorwaa, SCSA, \parencite[176]{Harvey2018}//
\gla mwalimú desír ng-a-∅-na kitaabú-i hariís//
\glb teacher.\Lmo{} girl.\Lfr{} \A{}.\Third-\P{}.\F{}-\Aux{}-\Imprf{} book.\Lmo{}-\Lat{} bring.\M{}.\Pst{}//
\glft `The teacher brought the girl the book.'//
\endgl
\xe
```

With the option `--style smallcaps`, the output uses the LaTeX command `\textsc{}` instead of glossing commands to format grammatical morphemes in small capitals:

```
> ./data.py gorwaa 3 --style smallcaps
\ex\label{ex:gorwaa-3}
\begingl
\glpreamble Gorwaa, SCSA, \parencite[176]{Harvey2018}//
\gla mwalimú desír ng-a-∅-na kitaabú-i hariís//
\glb teacher.\textsc{lmo} girl.\textsc{lfr} A.3-\textsc{p}.\textsc{f}-\textsc{aux}-\textsc{imprf} book.\textsc{lmo}-\textsc{lat} bring.\textsc{m}.\textsc{pst}//
\glft `The teacher brought the girl the book.'//
\endgl
\xe
```

### Accessing data using Python or R

The data in `languages.csv` can be used easily with Python, R, etc.
The following code block loads `languages.csv` into a data frame called `languages` using `pandas`.

```python
> import pandas
> languages = pandas.read_csv('languages.csv', sep=';')
> languages
Language Original Gloss Translation ... TPrs TNbr TGen TCase
0 Hungarian Mari lát-ja a könyv-et. Mari see-3SG.SBJ>3SG.OBJ the book-ACC Mari sees the book. ... 3 SG NaN ACC
1 Hungarian Mari neked ad-ja a könyv-et. Mari 2SG.DAT give-3SG.SBJ>3SG.OBJ the book-ACC Mari gives you the book. ... 3 SG NaN ACC
2 Gorwaa mwalimú kitaabú ng-u-∅-(g)a hariís dír desír teacher.LMo book.LMo A.3-P.M-AUX-PRF bring.M.P... The teacher brought the book to the girl. ... 3 SG M NOM
3 Gorwaa mwalimú desír ng-a-∅-na kitaabú-i hariís teacher.LMo girl.LFR A.3-P.F-AUX-IMPRF book.LM... The teacher brought the girl the book. ... 3 SG M LAT
4 Kapampangan Mamye (ya)ng tela ing mestra kareng babai. give cloth the teacher to women The teacher will give cloth to the women. ... 3 SG NaN ABS
.. ... ... ... ... ... ... ... ... ...
803 Yukulta  tʸina-ŋka ṭat̪int ṭaŋka-ŋala-pakarі miyaḷṭa y... where-PRES that+ABS man-ŋala-you+TR+PRES spear... Where's that man who gave you the spear? ... 3 SG NaN ABS
804 Yurok nek nahci-s-ek' ci·k I give-3SG-1SG money I gave him money ... 3 SG NaN NOM
805 Yurok nek nahci-s-ek' I give-3SG-1SG I give it to him ... 3 SG NaN NaN
806 Yurok nek nahci-s-ek' ku cey I give-3SG-1SG child I give it to the child ... 3 SG NaN NaN
807 Zulu uMandla u- bona [ukuthi ngi- ya- m- thanda] [u... AUG.1Mandla 1S- see that 1SG- YA- 1O- like  wh... Mandla sees that I like him when I give him pr... ... 3 PL NaN NOM

[808 rows x 24 columns]
```

There are 139 examples with neutral case and secundative agreement alignment (NCSA) in which the recipient's person (the agreement controller in secundative agreement) is first person:

```python
> languages.loc[(languages['Type'] == 'NCSA') & (languages['RPrs'] == '1')]
Language Original Gloss ... TNbr TGen TCase
142 Movima kɑyɬe:-kɑy--isne is lɑwɑ:jes give-INV-f.a ART.pl remedy ... SG NaN NOM
165 Squamish mi-ši-t-c-ka kʷi stáqʷ! come-RDR-TR-1.SG.OBJ-IMP DET water ... SG NaN NOM
257 Jingulu Ngunyɑ-ɑnɑ-mi ngɑmɑniki-rni milɑkurrmi-rni, ng... give-1O-IRR this(v)-FOC yam-FOC give-1O-IRR ... PL NaN ABS
321 Rembarnga tiŋʔ - yiʔ ŋinta - Ø  ŋana - pak - larayʔ - mi... woman - ERG 1min.PRON - NOM 1min.IMPL + 3aug.A... ... SG NaN ABS(NOM)
332 Ainu A-en-kore. 2HON-1SG-give ... SG NaN NaN
.. ... ... ... ... ... ... ...
797 Yaneshaʹ (Amuesha) Añach yetsom ñeñt̃ p̃-apa'-nmu-ey.  … 2SG-give-?-1PL ... SG NaN NaN
798 Yaneshaʹ (Amuesha) an̄ ye-po·s- aˀt-e·n- a ñeñtʸ pʸ- ahp- aˀn- ... this 1PL-be drunk-EP- PROG-REFLX which 2SG-giv... ... PL NaN NOM
799 Yukulta palmpiya-ŋalawa-yiniŋki wu:tʸa tomorrow-us(OBL)-you(NOM)+FUT give(Vtr)+IND ... SG NaN NaN
800 Yukulta palmpiya-t̪u-n̩iŋki wu:tʸa tomorrow-me(OBL)-he(NOM) give+IND ... SG NaN NaN
801 Yukulta palmpiya-nk-i-kant i wu:tʸa tomorrow-me(ACC)-you(NOM)-TR+FUT give+IND ... SG NaN NaN

[139 rows x 24 columns]
```

To get the same information in R using `tidyverse`, you can do the following:

```R
> library('tidyverse')
> languages <- read_csv2('languages.csv')
> languages %>% filter(Type == 'NCSA' & RPrs == '1')
# A tibble: 139 × 24
Language Original Gloss Translation Type Remarks Source Citekey Page Exno AddSourceInfo Predicate SbjPrs SbjNbr SbjGen SbjCase RPrs RNbr RGen RCase TPrs TNbr TGen TCase
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 Movima kɑyɬe:-kɑy--isne is lɑwɑ:jes give-INV-f.a… She gave m… NCSA NA Haude… Haude2… 404 162 {DM, Fracaso… give 3 SG NA NA 1 SG NA NA 3 SG NA NOM
2 Squamish mi-ši-t-c-ka kʷi stáqʷ! come-RDR-TR-… Bring me s… NCSA NA Kuipe… Kiyosa… 50 47 NA bring (i… NA NA NA NA 1 SG NA NA 3 SG NA NOM
3 Jingulu Ngunyɑ-ɑnɑ-mi ngɑmɑniki-rni milɑkurrmi-rni, ngunyɑ-ɑnɑ-mi! give-1O-IRR … Give me th… NCSA Pensal… Pensa… Pensal… 107 4.46k NA give (im… NA NA NA NA 1 SG NA NA 3 PL NA ABS
4 Rembarnga tiŋʔ - yiʔ ŋinta - Ø  ŋana - pak - larayʔ - miɲ ţeɲ -  Ø  woman - ERG … The women … NCSA NA McKay… McKay1… 298 (3.4… NA cook 3 PL NA ERG 1 SG NA NOM 3 SG NA ABS(…
5 Ainu A-en-kore. 2HON-1SG-give You (HON) … NCSA NA Shiba… Shibat… 56 94c NA give 2 HON NA NA 1 SG NA NA 3 SG NA NA
6 Ainu Ku-cis-kor sonno en-erɑmpokinu, beko tope poronno en-kore. 1SG-cry PROG… I was cryi… NCSA NA Shiba… Shibat… 86 7 NA give 3 SG NA NA 1 SG NA NA 3 SG NA NOM
7 Apurinã pu-sukɑ-no notɑ 2SG-give-1SG… Give away … NCSA potent… Facun… Facund… 290 20a NA give (im… 2 SG NA NA 1 SG NA NOM NA NA NA NA
8 Bagirmi N-ád-ūm jā mɨ̀-sáà. he-gave-me m… He gave me… NCSA NA Keega… Keegan… 17 NA NA give 3 SG NA NA 1 SG NA NA 3 SG NA NOM
9 Bagirmi ád-ūm jó nén kūyú. give-me to o… Give me an… NCSA NA Keega… Keegan… 25 NA NA give (im… NA NA NA NA 1 SG NA NA 3 SG NA NOM
10 Bagirmi ād-ūm kāɗ-mbī kéɗē. give-me spoo… Give me a … NCSA NA Keega… Keegan… 27 NA NA give (im… NA NA NA NA 1 SG NA NA 3 SG NA NOM
# … with 129 more rows
```

### Empty cells

Not all cells are filled. Empty cells represent missing information from the data points. If an argument is not expressed as a full NP, for example, but only as an agreement marker, its value in the corresponding case column (`SbjCase`, `RCase`, or `TCase`) is simply missing. In contrast, for transitives, the values for the recipient (`RPrs`, `RNbr`, `RGen`, `RCase`) are `/`.

## Sources

In addition to the data itself, the file `sources.bib` contains the bibliographical information of the sources listed in the `Citekey` column of `languages.csv`.

## Publications using (part of) the data set

Bárány, András. 2021. [A typological gap in ditransitive constructions: No secundative case and indirective agreement](https://www.lingref.com/cpp/wccfl/38/abstract3549.html). In Rachel Soo, Una Y. Chow & Sander Nederveen (eds.), *Proceedings of the 38th West Coast Conference on Formal Linguistics*, 43–53.
Somerville, MA: Cascadilla Proceedings Project.

## Using the data

If you are interested in using any of the data in this data set, please cite (and double check) the original source. If you are using the data set as a whole, feel free to acknowledge its use by citing it:

```latex
@online{BaranyClasse2022,
  author = {Bárány, András and Classe, Nora-Friederike},
  title = {Data set of ditransitive alignment patterns},
  year = {2022},
  url = {https://osf.io/k386x/},
  doi = {10.17605/OSF.IO/K386X},
}
```

Bárány, András & Classe, Nora-Friederike. 2022. *Data set of ditransitive alignment patterns*. DOI: [10.17605/OSF.IO/K386X](https://doi.org/10.17605/OSF.IO/K386X). https://osf.io/k386x/.
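
As a supplement to the section on empty cells, the following is a minimal sketch of how empty cells and `/` could be told apart in `pandas`. It rests on several assumptions: the rows in `SAMPLE` are invented placeholders (not taken from `languages.csv`) that only reuse a few of the documented column names, the helpers `load` and `cell_status` are hypothetical and not part of `data.py`, and reading `/` as "not applicable" is an interpretation of the description of transitives rather than a label used by the data set itself.

```python
import io

import pandas as pd

# Invented placeholder rows (NOT taken from languages.csv). They use the same
# `;` separator and a subset of the documented column names.
SAMPLE = """Language;Type;RPrs;RCase;TPrs;TCase
LangA;Transitive;/;/;3;ACC
LangA;NCSA;1;;3;
LangB;ICIA;3;DAT;3;NOM
"""


def load(source):
    """Read a `;`-separated file or buffer, keeping every column as text."""
    # dtype=str keeps person values such as "1" as strings. Empty cells are
    # read as NaN, while "/" is kept as an ordinary string.
    return pd.read_csv(source, sep=";", dtype=str)


def cell_status(value):
    """Classify a cell as 'missing', 'not applicable' or 'value'."""
    if pd.isna(value):
        return "missing"
    if value == "/":
        # Assumption: "/" marks a feature that cannot apply, e.g. the
        # recipient of a transitive clause.
        return "not applicable"
    return "value"


sample = load(io.StringIO(SAMPLE))
statuses = sample["RCase"].map(cell_status)
print(statuses.tolist())
```

To try this on the real data, `load('languages.csv')` could be used in place of the in-memory sample, assuming the file is in the working directory.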