Category: Data

Description: A dataset for investigating automatic speech recognition in the domain of spoken programming languages.

Wiki

This dataset includes:

  1. Transcriptions of programmers speaking one or a few lines of Java code, paired with the corresponding program statements.

  2. Single-line comments extracted from the CodeXGLUE dataset. CodeXGLUE is derived from the curated CodeSearchNet dataset, which was used for a code summarization task.

  3. A 4-gram word-level mixture language model created by…

Tags

language modeling, programming languages, speech recognition, voice programming
