# Quantifying Lexical Ambiguity

## Repository Overview

* `code/`
  * `code/notebooks` contains any notebooks used to generate reports (including type counts, figures, tables, etc.)
  * `code/scripts` contains any processing scripts used in the analysis pipeline
  * `code/functions` contains any helper functions used by the `code/notebooks` notebooks or the `code/scripts` scripts. They should be numbered to match their corresponding script
  * `code/reports` contains any notebooks used to generate images for this submission

## Tools and Dependencies

This project requires R version 4.0.3 and Python 3.8.6. Python is called through the `reticulate` package, which uses the Python environment included in this repository. To use your own Python environment instead, create a Python 3.8.6 environment, install the required Python packages with `pip install -r requirements.txt`, and repoint the `reticulate` package to that environment.

## Analysis Pipeline

The analysis pipeline is as follows:

* `notebooks/00_do_full_preprocessing.Rmd`: Reads the data from either a live connection to our tag database or raw CSV files.
* `notebooks/01a_semcor_tags.ipynb`: Collects the tags from SemCor using the NLTK corpus reader
* `notebooks/01b_process_semcor_tags.Rmd`: Preprocesses the SemCor tags to match the preprocessing steps applied to our data (e.g. lemmatization, WordNet sense matching, filtering)
* `notebooks/02_interpolatedSenseCounts.ipynb`: Generates the subsampled counts for the target types
* `scripts/05_WordSense_dirichletMultinomial.R --analysis semcor`: Executes the model over adult-directed vs. child-directed speech (analysis 1a)
* `scripts/05_WordSense_dirichletMultinomial.R --analysis adultsVsChildren`: Executes the model over adult-produced vs. child-produced speech (analysis 1b)
* `notebooks/05_WordSense_DirchletMultinomial.ipynb`: Runs the analysis and processing over the results from 1a and 1b above
* `notebooks/03_AOFP.ipynb`: Uses the resulting entropy estimates to predict Age of First Production
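
The step order in the pipeline list above can be written down as a small driver. This is only a sketch and not part of the repository: the paths and `--analysis` flags are copied from the list, but the launchers are assumptions (`jupyter nbconvert --execute` for `.ipynb`, `rmarkdown::render` for `.Rmd`, `Rscript` for `.R`). The helper names `build_command` and `build_pipeline_commands` are hypothetical. The sketch only builds the commands and does not run them.

```python
from pathlib import PurePosixPath

# Steps copied in order from the pipeline list: (path, extra CLI arguments).
PIPELINE = [
    ("notebooks/00_do_full_preprocessing.Rmd", []),
    ("notebooks/01a_semcor_tags.ipynb", []),
    ("notebooks/01b_process_semcor_tags.Rmd", []),
    ("notebooks/02_interpolatedSenseCounts.ipynb", []),
    ("scripts/05_WordSense_dirichletMultinomial.R", ["--analysis", "semcor"]),
    ("scripts/05_WordSense_dirichletMultinomial.R", ["--analysis", "adultsVsChildren"]),
    ("notebooks/05_WordSense_DirchletMultinomial.ipynb", []),
    ("notebooks/03_AOFP.ipynb", []),
]


def build_command(path, args):
    """Return an argv list for one step, picking a launcher by file extension."""
    suffix = PurePosixPath(path).suffix
    if suffix == ".ipynb":
        # Assumption: notebooks are executed headlessly with jupyter nbconvert.
        return ["jupyter", "nbconvert", "--to", "notebook",
                "--execute", "--inplace", path]
    if suffix == ".Rmd":
        # Assumption: R Markdown files are rendered with rmarkdown::render.
        return ["Rscript", "-e", f"rmarkdown::render('{path}')"]
    if suffix == ".R":
        # Assumption: R scripts are launched with Rscript plus their flags.
        return ["Rscript", path, *args]
    raise ValueError(f"unsupported step type: {path}")


def build_pipeline_commands(steps=PIPELINE):
    """Build the commands for every step, preserving the listed order."""
    return [build_command(path, args) for path, args in steps]


if __name__ == "__main__":
    for command in build_pipeline_commands():
        print(" ".join(command))
```

To actually run the steps, each argv list could be passed to `subprocess.run(command, check=True)` from whichever directory the paths are relative to. The list does not say which directory that is.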