Abstract: In this paper, we report a qualitative and quantitative evaluation of a hand-crafted set of discourse features and their interaction with different text types. Specifically, we compare two distinct text types, scientific abstracts and their accompanying full texts, in terms of linguistic properties that include, among others, sentence length, coreference information, noun density, self-mentions, noun phrase count, and noun phrase complexity. Our findings suggest that abstracts and full texts differ in three mechanisms that are bound to size and purpose. In abstracts, nouns tend to be more densely distributed: the distance between noun occurrences is smaller, which we attribute to the compact size of abstracts. Furthermore, in abstracts we find a higher frequency of personal and possessive pronouns, which authors use to refer to themselves. In contrast, in full texts we observe a higher frequency of noun phrases. These findings are our first attempt to identify linguistic features motivated by text type that can help us draw clearer text type boundaries. These features could be used as parameters in the construction of writing evaluation systems that assist writing tutors in text analysis, or as guides in linguistically controllable neural text generation systems.
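As a rough illustration of how two of the surface features named in the abstract could be computed, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the pronoun inventory `SELF_MENTIONS`, the regex tokenizer, and the sentence splitter are all assumptions made for this example. Features such as noun density or noun phrase count would additionally need a part-of-speech tagger or parser and are not covered here.

```python
import re

# Assumed, illustrative inventory of first-person personal and possessive
# pronouns; the paper's actual list is not given in the abstract.
SELF_MENTIONS = {"i", "we", "me", "us", "my", "our", "mine", "ours"}


def tokenize(text):
    """Lowercase the text and return alphabetic word tokens (naive regex)."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text.lower())


def split_sentences(text):
    """Naively split on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def self_mention_rate(text):
    """Share of tokens that are first-person pronouns (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(t in SELF_MENTIONS for t in tokens) / len(tokens)


def mean_sentence_length(text):
    """Average number of word tokens per sentence (0.0 for empty text)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


if __name__ == "__main__":
    # Toy inputs invented for this example, not data from the study.
    abstract = "We report our evaluation. We compare two text types."
    full_text = "The corpus contains scientific articles. Each article has an abstract."
    for name, text in [("abstract", abstract), ("full text", full_text)]:
        print(name, self_mention_rate(text), mean_sentence_length(text))
```

Normalizing the pronoun count by token count makes the rate comparable across texts of very different lengths, which matters when short abstracts are compared with full texts.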