This dataset contains the retweet networks and the tweet IDs of the tweets used in the article:

Bovet, A. & Makse, H. A. Influence of fake news in Twitter during the 2016 US presidential election. Nat. Commun. 10, 7 (2019). [doi:10.1038/s41467-018-07761-2](https://doi.org/10.1038/s41467-018-07761-2)

The classification of news outlets in the different media categories is a matter of opinion, rather than a statement of fact. This opinion originated in publicly available datasets from fact-checking organizations, i.e. www.opensources.co (copy at https://github.com/alexbovet/opensources), www.mediabiasfactcheck.com and www.allsides.com. This classification of news media should not be interpreted as representing the opinions of the authors of the article.

-----------------------------------------

There are 8 compressed CSV files that contain the tweet ID of each tweet we collected from June 1st 2016 until November 9th 2016 (including the original tweets of retweets) that contained at least one URL directing to a news outlet website of the corresponding media category.

- `tweet_id_list_fake_news.csv.gz`
- `tweet_id_list_right.csv.gz`
- `tweet_id_list_center.csv.gz`
- `tweet_id_list_lean_left.csv.gz`
- `tweet_id_list_extreme_bias_left.csv.gz`
- `tweet_id_list_lean_right.csv.gz`
- `tweet_id_list_extreme_bias_right.csv.gz`
- `tweet_id_list_left.csv.gz`

There are 8 compressed CSV files that contain the corresponding retweet graphs, where each line represents one edge of the graph (including parallel edges) in the form `source_id,target_id,tweet_id`. The source_id and target_id are anonymized user IDs. The edge direction represents the flow of information, i.e. an edge from source_id to target_id represents the fact that target_id retweeted a tweet posted by source_id.

- `retweet_graph_fake_news.csv.gz`
- `retweet_graph_right.csv.gz`
- `retweet_graph_center.csv.gz`
- `retweet_graph_lean_left.csv.gz`
- `retweet_graph_extreme_bias_left.csv.gz`
- `retweet_graph_lean_right.csv.gz`
- `retweet_graph_extreme_bias_right.csv.gz`
- `retweet_graph_left.csv.gz`

The media categories and the news outlets in each category are detailed in our article (see Supplementary Table 1 of the article).

The tweet ID lists contain duplicate tweet IDs whenever a tweet contained more than one URL linking to a news outlet. The retweet networks contain parallel edges whenever a user retweeted another user more than once.

Software such as [hydrator](https://github.com/DocNow/hydrator) and [tweepy](https://www.tweepy.org/) can be used to “rehydrate” the tweet IDs, i.e. download the full tweet objects using the tweet IDs.
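As an illustration of the file layout described above, here is a minimal Python sketch (standard library only) that reads one of the compressed files, collapses parallel edges into per-pair retweet counts, and removes duplicate tweet IDs. It is not part of the dataset or the article's code. The function names and the `has_header` parameter are hypothetical, and the sketch assumes that the files have no header row and that the tweet ID is the first field of each line in the tweet ID lists; please check both against the actual files.

```python
import csv
import gzip
from collections import Counter


def read_rows(path, has_header=False):
    """Yield each CSV row of a gzip-compressed file as a list of strings."""
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # assumption: skip a header row if one exists
        for row in reader:
            if row:  # ignore blank lines
                yield row


def edge_weights(path, has_header=False):
    """Collapse parallel edges into (source_id, target_id) -> retweet count."""
    weights = Counter()
    for source_id, target_id, _tweet_id in read_rows(path, has_header):
        # Edge direction follows the flow of information:
        # target_id retweeted a tweet posted by source_id.
        weights[(source_id, target_id)] += 1
    return weights


def unique_tweet_ids(path, has_header=False):
    """Return tweet IDs in file order with duplicates removed."""
    # Assumption: the tweet ID is the first field of each row.
    ids = (row[0] for row in read_rows(path, has_header))
    return list(dict.fromkeys(ids))
```

For example, `edge_weights("retweet_graph_fake_news.csv.gz")` would return a counter whose values give how many times each target user retweeted each source user, and `unique_tweet_ids("tweet_id_list_fake_news.csv.gz")` would return each tweet ID once, ready to be passed to a rehydration tool. IDs are kept as strings so that no precision is lost on large tweet IDs.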