SiClaEn


Category: Project

Description: Algorithm to classify documents of a resource-poor language by means of data and tools from a resource-rich language

Wiki

The SiClaEn dataset contains a Reuters English News DataSet and a Sinhala News DataSet. The Sinhala News DataSet was collected from bilingual Sinhala and English news sources such as AdaDerana and NewsFirst. The Reuters English News DataSet has 7103 sentences in 383 posts, and the Sinhala News DataSet has 5221 sentences in 471 posts. All datasets are categorized under the following topics: busi...
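As a rough illustration of the sizes quoted above, the short sketch below computes the average number of sentences per post for each dataset. The counts are taken from the description; the dictionary layout and the function name are assumptions made for this example only and do not reflect how the dataset files are organized.

```python
# Counts copied from the dataset description; the structure and names
# below are illustrative assumptions, not part of the dataset itself.
DATASETS = {
    "Reuters English News DataSet": {"sentences": 7103, "posts": 383},
    "Sinhala News DataSet": {"sentences": 5221, "posts": 471},
}


def sentences_per_post(counts):
    """Return the mean number of sentences per post for one dataset."""
    return counts["sentences"] / counts["posts"]


# Round to one decimal place for a readable summary.
averages = {
    name: round(sentences_per_post(counts), 1)
    for name, counts in DATASETS.items()
}

for name, avg in averages.items():
    print(f"{name}: about {avg} sentences per post")
```

By this measure the English posts are noticeably longer on average than the Sinhala posts, which may be worth keeping in mind when comparing results across the two languages.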

