Wattpad titles corpus

Category: Data

Description: This is a collection of the titles of stories published on Wattpad as of January 2018. It is a corpus of around 30 million titles in more than 50 languages. It consists mainly of original fiction, with a small share of fan fiction (roughly 10%). The R Markdown files for the network analysis and sentiment analysis procedures can be found in the GitHub repository: https://github.com/SimoneRebora/Wattpad_analysis. We published an article based on this data: https://doi.org/10.1371/journal.pone.0226708

License: CC-By Attribution 4.0 International

Wiki

This project contains the data, the code, and the results of several analyses of stories published on wattpad.com. The corpus reflects the state of Wattpad as of January 2018, based on the sitemap files found on the server. Thus, it is not a complete dataset of all of Wattpad's stories. More information about corpus building, analyses, and results can be found in the article: Pianzola, F., Rebora, S, an...
