# Political Astroturfing across the World

### David Schoch, Franziska B. Keller, Sebastian Stier, JungHwan Yang

## Replication materials

The data and code provided here enable the replication of all analyses in ***Political Astroturfing across the World***.

## Content

The archive contains raw and aggregated data, as well as the R and bash scripts underlying the analysis in the paper and supplementary material.

## R packages

Most of the analyses were done in R. The packages used are listed below, together with the code to install them.

```{r libs, eval = FALSE}
# start from a clean workspace
rm(list = ls())

# install the package managers if they are missing
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# load all required packages, installing any that are missing
pacman::p_load(tidyverse, rvest, progress, data.table, igraph, ggraph, patchwork,
               update = FALSE)
```

## Scripts

### pre-clean

| script | description |
|---|---|
| `del_newlines.sh` | delete newline characters within tweets |
| `remove_null.sh` | remove embedded nulls |

### Analysis

| script | description |
|---|---|
| `00_helper.R` | helper functions |
| `01_download.R` | download the Twitter release data |
| `02_clean.R` | clean the downloaded data |
| `02a_clean_random.R` | clean the random sample data (random sample data not provided) |
| `03_clean_huge.R` | clean the data of campaigns that are larger than 4GB |
| `04_basic_stats.R` | compute basic statistics |
| `05_time_stats.R` | compute stats related to time and date |
| `06_network_data.R` | compute network data (cotweet, coretweet, retweet) for campaigns |
| `06a_network_data_location.R` | compute network data (cotweet, coretweet, retweet) for random sample based on location |
| `06b_network_data_hashtag.R` | compute network data (cotweet, coretweet, retweet) for random sample based on hashtags |
| `07_heatmap_data.R` | compute data to produce activity heatmaps |
| `08_detectable_data.R` | compute the fraction of detectable accounts |
| `09_densities.R` | compute network densities |
| `retweets.sh` | bash script called to compute retweets |
| `coordination.sh` | bash script called to compute co(re)tweets (complete) |
| `coordination-sparse.sh` | bash script called to compute co(re)tweets (only 1 minute) |
| `dec2int.cpp` | C++ code to turn decimal into binary and back |

### Plotting

| script | description |
|---|---|
| `p00_misc.R` | mostly summary statistics |
| `p01_timeseries.R` | code to produce time series plots |
| `p02_coordination_stats.R` | visualize stats of coordination patterns |
| `p03_comparison_w_random.R` | visualize comparison with random samples |
| `p04_network_for_paper.R` | illustrative network plots |
| `p05_SI_astro_vs_randos.R` | comparisons for the SI |
| `p06_detected_accounts.R` | visualize the detected accounts |
| `p07_densities.R` | visualize network densities |
| `p08_detecbar.R` | visualize detectable accounts as bar charts for all campaigns |

## Directory structure

The code assumes the following folder structure (all folders are empty, except the bold ones):

- Project folder
  - **raw_data** (contains an html file to download all raw data)
    - random_samples
    - hashtag_samples
    - campaigns
  - **bash** (contains bash scripts mentioned above)
  - **Rscripts** (contains R scripts mentioned above)
  - processed_data
    - random_samples
    - hashtag_samples
    - campaigns
  - aggregated_data
    - random_samples
    - hashtag_samples
    - campaigns
  - **RData** (contains some aggregated data as RDS file)

## System requirements

All analyses were run on a ThinkPad running Ubuntu 20.04 with 16GB of RAM and R version 4.0.2.

## Data files

While Twitter’s terms and conditions prohibit sharing raw data, we share download scripts and tweet IDs that enable researchers to reconstruct the datasets.
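
Before running the download and cleaning scripts, the empty folders listed under "Directory structure" need to exist. The following is a minimal sketch of one way to create them, not part of the original archive. It assumes that `random_samples`, `hashtag_samples` and `campaigns` are subfolders of each of `raw_data`, `processed_data` and `aggregated_data`, and it uses a hypothetical project root name (`astroturfing_project`) that can be overridden through the `PROJECT_DIR` environment variable.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical project root (an assumption, not a name used by the archive).
PROJECT_DIR="${PROJECT_DIR:-astroturfing_project}"

# Data folders; the nesting of the three subfolders is assumed from the folder list.
for top in raw_data processed_data aggregated_data; do
  for sub in random_samples hashtag_samples campaigns; do
    mkdir -p "$PROJECT_DIR/$top/$sub"
  done
done

# Folders that ship with content in the archive (scripts and RDS files).
mkdir -p "$PROJECT_DIR/bash" "$PROJECT_DIR/Rscripts" "$PROJECT_DIR/RData"
```

Because `mkdir -p` leaves existing folders untouched, the sketch can be run safely on top of an already unpacked archive.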