<p>This repository contains some of the materials used to prepare my talk at the <a href="https://moralmedia.org/annual-meeting/" rel="nofollow">Moral Media meeting</a> at Ohio State University in September 2018. The slides, in HTML format, are also included here and can be viewed at <a href="https://jacob-long.com/slides/MM18-slides.html" rel="nofollow">this link</a>.</p>

<h2>What is this?</h2>

<p>The project involved collecting Billboard music charts in several genres since 1992, retrieving the lyrics for those songs, and content-analyzing them. In particular, I used the <a href="https://moralfoundations.org/othermaterials" rel="nofollow">Moral Foundations Dictionary</a> as well as an ad hoc political words dictionary. In addition to the conventional word-count analysis, I used <a href="https://link.springer.com/article/10.3758%2Fs13428-017-0875-9" rel="nofollow">distributed dictionary representations</a> (DDR) to quantify the moral and political content of the lyrics.</p>

<h2>Use</h2>

<p>The repository is self-sufficient insofar as you should be able to generate the slides from <code>Slides.Rmd</code> with R and interact with the processed data (stored in <code>charts.rds</code>) to explore as you wish.</p>

<p>The variable names should be fairly self-explanatory if you know Moral Foundations Theory well enough to be interested in this project. The <code>.ddr</code> suffixes distinguish the DDR loadings from the dictionary frequencies.</p>

<p>See <code>Slides.Rmd</code> for example code for converting the data from the format in <code>charts.rds</code> --- one song-chart-week per row --- to other shapes and levels of analysis, such as one genre-chart-week per row.</p>

<h2>Raw data</h2>

<p>There are two raw data sources that I cannot include:</p>

<ul>
<li>The raw lyrics data, because I do not own the copyright and so cannot distribute it.</li>
<li>The training corpus for the DDR, due to its size.</li>
</ul>

<p>You can use <a href="https://github.com/jacob-long/Song-and-Lyric-Data-Scraper" rel="nofollow">my data scraper</a> to collect your own raw lyrics data if you'd like. You can then run the <code>sqlite_to_R.R</code> and <code>raw_to_counts.R</code> scripts, in that order, to get the raw data into the format needed to replicate the slides with <code>Slides.Rmd</code>.</p>

<p>The training corpus I used is available at <a href="https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit" rel="nofollow">this link</a>. Bear in mind it is over 1 GB.</p>

<p>To install the DDR Python module, run <code>python setup.py install</code> from the DDR directory. After you have created the <code>unique_songs_raw.csv</code> file with the <code>raw_to_counts.R</code> script, run <code>python ddr-script.py</code> from this project's directory on the command line to generate the DDR loadings.</p>

<p>Of course, you can skip all of this and just use the <code>charts.rds</code> file if you want the data with only the dictionary counts and DDR loadings.</p>
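<p>As a quick sketch of the kind of exploration and reshaping described above --- note that the column names <code>genre</code>, <code>chart_date</code>, and <code>care.ddr</code> are illustrative assumptions here; check <code>names(charts)</code> and <code>Slides.Rmd</code> for the real ones:</p>

```r
# Exploration sketch: collapse one song-chart-week per row down to
# one genre-chart-week per row. Column names are hypothetical.
library(dplyr)

charts <- readRDS("charts.rds")   # processed data: one song-chart-week per row
str(charts)                       # inspect the actual variable names

genre_weeks <- charts %>%
  group_by(genre, chart_date) %>%
  summarize(
    mean_care_ddr = mean(care.ddr, na.rm = TRUE),  # average a DDR loading
    n_songs       = n(),                           # songs charting that week
    .groups = "drop"
  )
```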
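<p>The replication steps above can be collected into one sketch. The script and file names come from this README, but the assumptions that each script runs non-interactively from the project root, and that the slides build with <code>rmarkdown::render()</code>, are mine:</p>

```r
# Replication pipeline sketch, run from the project root directory.
source("sqlite_to_R.R")    # scraped SQLite lyrics database -> R objects
source("raw_to_counts.R")  # dictionary counts; writes unique_songs_raw.csv

# Generate the DDR loadings. Requires the DDR Python module, installed
# beforehand with `python setup.py install` from the DDR directory.
system("python ddr-script.py")

# Rebuild the slides from the processed data.
rmarkdown::render("Slides.Rmd")
```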