<p><strong>Paper:</strong></p>
<p><a href="https://doi.org/10.1371/journal.pone.0194290" rel="nofollow">https://doi.org/10.1371/journal.pone.0194290</a></p>
<p><strong>Data:</strong></p>
<p>The data consists of three files:</p>
<ul>
<li><em>outcomes</em>: dense table with county drinking and socio-demographic variables</li>
<li><em>topics</em>: topic frequencies for each county in sparse format</li>
<li><em>1grams</em>: word frequencies for each county in sparse format</li>
</ul>
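<p>The sparse files hold one row per county and feature pair, so they usually need to be pivoted before being joined to the dense <em>outcomes</em> table. Below is a minimal sketch of that step using pandas on toy data. The column names <code>group_id</code>, <code>feat</code> and <code>group_norm</code> are assumptions based on typical DLATK feature tables, and the values are made up; check the headers of the actual files before relying on them.</p>

```python
import pandas as pd

# Toy rows in a sparse (long) layout: one row per (county, feature) pair.
# Column names and values are assumptions for illustration only.
sparse = pd.DataFrame({
    "group_id": ["01001", "01001", "01003"],  # county identifier
    "feat": ["beer", "wine", "beer"],         # word or topic id
    "group_norm": [0.002, 0.001, 0.004],      # relative frequency
})


def to_dense(sparse_df):
    """Pivot to one row per county and one column per feature.

    County/feature pairs that are absent from the sparse table become 0.
    """
    return sparse_df.pivot_table(
        index="group_id",
        columns="feat",
        values="group_norm",
        aggfunc="sum",
        fill_value=0.0,
    )


dense = to_dense(sparse)
print(dense)
```

<p>For the full 1gram file the dense matrix may be large, so filtering to the features of interest before pivoting is likely the safer choice.</p>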
<p><strong>Data Files:</strong></p>
<p>Data is available in both CSV and MySQL formats:</p>
<ul>
<li>CSV
<ul>
<li>outcomes.csv</li>
<li>feat.cat_met_a30_2000_cp_w.msgs_2011to13.cnty.16to16.csv.zip</li>
<li>feat.1gram.msgs_2011to13.cnty.16to16.0_1.csv.zip</li>
</ul>
</li>
<li>MySQL
<ul>
<li>county_drinking_plosone2018.sql.zip</li>
</ul>
</li>
</ul>
<p><strong>Analysis:</strong></p>
<p>All analysis was run using the <a href="http://dlatk.wwbp.org" rel="nofollow">DLATK Python package</a>. DLATK reads its data from MySQL, so the data is also provided as a single SQL dump.</p>