The SiClaEn dataset contains two subsets: a Reuters English News DataSet and a Sinhala News DataSet. The Sinhala News DataSet was collected from bilingual Sinhala and English news sources such as AdaDerana and NewsFirst. The Reuters English News DataSet has 7103 sentences in 383 posts, and the Sinhala News DataSet has 5221 sentences in 471 posts. Both subsets are categorized under the following topics: business, entertainment, politics, science & technology, and sports.
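
As a quick sanity check on the figures above, the following minimal sketch records the stated sizes and topics as constants and derives the average number of sentences per post. The names `SubsetStats`, `SUBSETS`, `CATEGORIES`, and `summarize` are hypothetical and chosen only for illustration; the sketch does not assume anything about how the dataset files are laid out on disk.

```python
from dataclasses import dataclass

# Topic labels as listed in the description (lowercased here for illustration).
CATEGORIES = (
    "business",
    "entertainment",
    "politics",
    "science & technology",
    "sports",
)


@dataclass(frozen=True)
class SubsetStats:
    """Size of one subset, taken from the description above."""
    name: str
    sentences: int
    posts: int

    @property
    def sentences_per_post(self) -> float:
        return self.sentences / self.posts


# Counts copied from the dataset description; nothing here is measured.
SUBSETS = (
    SubsetStats("Reuters English News DataSet", 7103, 383),
    SubsetStats("Sinhala News DataSet", 5221, 471),
)


def summarize(subsets=SUBSETS):
    """Return the average sentences per post for each subset, rounded to 2 places."""
    return {s.name: round(s.sentences_per_post, 2) for s in subsets}


if __name__ == "__main__":
    for name, avg in summarize().items():
        print(f"{name}: {avg} sentences per post on average")
    print("Total sentences:", sum(s.sentences for s in SUBSETS))
    print("Total posts:", sum(s.posts for s in SUBSETS))
```

On these figures the Reuters posts are noticeably longer on average than the Sinhala posts, which may be worth keeping in mind when comparing per-post results across the two subsets.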