LASTU / datasets
Category: Data
Description: Finnish datasets for LASTU.
License: CC-BY-SA 4.0.
Derived from Finnish Internet Parsebank: J. Luotolahti; J. Kanerva; V. Laippala; S. Pyysalo; F. Ginter. Towards Universal Web Parsebanks. Proceedings of the International Conference on Dependency Linguistics (Depling’15). 2015.
https://aclanthology.org/W15-2124/
https://turkunlp.org/finnish_nlp.html
When this database is used, it should be cited as Luotolahti et al. (2015).
Filename schema: lang_source_tokens_minfreq.db, where
- lang: the language (e.g., fi, es)
- source: the data source (e.g., parsebank, tdt)
- tokens: gross token amount (e.g., 50M, 2B)
- minfreq: minimum frequency, or "full" if not applicable (e.g., 10)
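The filename schema described above can be split into its four fields programmatically. The following is a minimal sketch, not part of the dataset itself: the names DatasetName and parse_dataset_filename are hypothetical, the example filenames are assembled from the sample values in the description rather than taken from the file listing, and the pattern assumes that the token amount is written as digits followed by an optional unit letter (as in 50M or 2B).

```python
import re
from typing import NamedTuple


class DatasetName(NamedTuple):
    """Fields of a lang_source_tokens_minfreq.db filename (hypothetical helper type)."""
    lang: str      # language code, e.g. "fi"
    source: str    # data source, e.g. "parsebank"
    tokens: str    # gross token amount, e.g. "2B"
    minfreq: str   # minimum frequency as digits, or "full"


# Assumption: tokens is digits plus an optional unit letter; minfreq is digits or "full".
_PATTERN = re.compile(
    r"^(?P<lang>[a-z]+)_"
    r"(?P<source>[A-Za-z0-9]+)_"
    r"(?P<tokens>[0-9]+[A-Za-z]?)_"
    r"(?P<minfreq>[0-9]+|full)\.db$"
)


def parse_dataset_filename(filename: str) -> DatasetName:
    """Split a filename into its schema fields, or raise ValueError if it does not fit."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"not a lang_source_tokens_minfreq.db filename: {filename!r}")
    return DatasetName(**match.groupdict())


if __name__ == "__main__":
    # Illustrative filenames only; check the actual file listing for real names.
    print(parse_dataset_filename("fi_parsebank_2B_10.db"))
    print(parse_dataset_filename("fi_tdt_50M_full.db"))
```

Keeping minfreq as a string preserves the "full" value; callers that need a number can convert it after checking for that case.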