This repository contains some of the materials used to prepare my talk at the [Moral Media meeting](https://moralmedia.org/annual-meeting/) at Ohio State University in September 2018. The slides, in HTML format, are also here and can be accessed at [this link](https://jacob-long.com/slides/MM18-slides.html).

## What is this

The project involved collecting Billboard music charts in a few genres going back to 1992, retrieving the lyrics for those songs, and content analyzing them. In particular, I used the [Moral Foundations Dictionary](https://moralfoundations.org/othermaterials) as well as an ad hoc political words dictionary. In addition to the conventional word count analysis, I used [distributed dictionary representations](https://link.springer.com/article/10.3758%2Fs13428-017-0875-9) (DDR) to quantify the moral and political content of the lyrics.

## Use

The repository is self-sufficient insofar as you should be able to generate the slides from `Slides.Rmd` with R and interact with the processed data (stored in `charts.rds`) to explore as you wish.

The variable names should be fairly self-explanatory if you know Moral Foundations Theory well enough to be interested in this project. The `.ddr` suffixes differentiate the DDR loadings from the dictionary frequencies.

Look to the `Slides.Rmd` file for some example code for converting the data from the format in `charts.rds` (one song-chart-week per row) to other shapes and levels, like one genre-chart-week per row.

## Raw data

There are two raw data sources that I cannot include:

* The raw lyrics data, because I do not own the copyright and so cannot distribute it.
* The training corpus for the DDR, due to its size.

You can check out [my data scraper](https://github.com/jacob-long/Song-and-Lyric-Data-Scraper) to collect your own raw lyrics data if you'd like. You can then run the `sqlite_to_R.R` script followed by the `raw_to_counts.R` script to get the raw data into the format needed to replicate the slides with `Slides.Rmd`.

The training corpus I used is available at [this link](https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit). Bear in mind it is over 1GB.

To install the DDR Python module, run `python setup.py install` from the DDR directory. Once you have created the `unique_songs_raw.csv` file with the `raw_to_counts.R` script, run `python ddr-script.py` from this project's directory on the command line to generate the DDR loadings.

Of course, you can skip all of this and just use the `charts.rds` file if you only want the data with the dictionary counts and DDR loadings.
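
To give a rough sense of what a DDR loading measures, here is a minimal Python sketch of the general idea described in the linked DDR paper: average the word vectors for a dictionary's words, average the word vectors for a text's words, and take the cosine similarity between the two. This is not the code in `ddr-script.py` and it does not use the DDR module's API. The toy three-dimensional vectors, the word lists, and the function names are all made up for illustration, whereas the real analysis relies on vectors from the large training corpus.

```python
import math

# Hypothetical toy "embeddings" (assumption: real DDR uses high-dimensional
# vectors learned from a large training corpus, not hand-written numbers).
TOY_VECTORS = {
    "care": [0.9, 0.1, 0.0],
    "protect": [0.8, 0.2, 0.1],
    "harm": [0.7, 0.0, 0.3],
    "love": [0.85, 0.15, 0.05],
    "truck": [0.0, 0.9, 0.4],
    "road": [0.1, 0.8, 0.5],
}


def mean_vector(words, vectors):
    """Average the vectors of the words that have one; None if no word does."""
    found = [vectors[w] for w in words if w in vectors]
    if not found:
        return None
    dims = len(found[0])
    return [sum(v[i] for v in found) / len(found) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def ddr_loading(text, dictionary_words, vectors):
    """Similarity between a text's mean vector and a dictionary's mean vector."""
    doc_vec = mean_vector(text.lower().split(), vectors)
    dict_vec = mean_vector(dictionary_words, vectors)
    if doc_vec is None or dict_vec is None:
        return None  # nothing to compare: no known words on one side
    return cosine(doc_vec, dict_vec)


# Hypothetical mini-dictionary standing in for one moral foundation.
care_dictionary = ["care", "protect", "harm"]

caring_lyric = ddr_loading("love and protect", care_dictionary, TOY_VECTORS)
driving_lyric = ddr_loading("truck on the road", care_dictionary, TOY_VECTORS)
print(caring_lyric, driving_lyric)
```

In this toy setup the first lyric loads more heavily on the made-up care dictionary than the second, even though "love" is not itself a dictionary word. That is the main appeal of DDR over plain word counts.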