Description: People use some numbers more than others. For example, people tend to discuss smaller numbers more than larger numbers, and round numbers more than unround numbers of a similar magnitude. We identify over 1.5 million numbers from 0 to 1 billion in the British National Corpus and observe how often each number appears. These numbers are represented as numerals (e.g., ‘100’), number words (e.g., ‘three thousand and two’), or a mixture of both (e.g., ‘2 million’). Our results demonstrate that frequency declines with numerical magnitude, but that round numbers are used more frequently than unround numbers of a similar magnitude. At higher magnitudes, people round to a greater extent: for example, rounding 99 to 100, but rounding 899 to 1000. Our results suggest that round numbers are not created equal: those with more properties associated with roundness (e.g., dividing by 10, 100, 1000, etc. yields an integer from 1 to 9) are used more often than those with fewer roundness properties. We also find that some of these properties are more important predictors of roundness than others. In general, we find that formal registers of speech and writing are more numerically dense (people use numbers more often) and more numerically diverse (people use more varied numbers). Formal registers also contain more decimals and are less biased toward smaller numbers. In writing, we also find that the numbers 1–9 are mostly represented as number words, 10–999,999 are mostly represented as numerals, and 1 million–1 billion are mostly represented in a mixed format. This study sheds light on how people discuss numbers in spoken and written English on a large scale.
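
As an illustration of the one roundness property spelled out above (dividing by 10, 100, 1000, etc. yields an integer from 1 to 9), a minimal sketch follows. The function name is hypothetical and the code is not taken from the study; it only shows how such a property could be checked for a non-negative integer.

```python
def is_single_digit_multiple_of_power_of_ten(n: int) -> bool:
    """Return True if n divided by some power of ten (10, 100, 1000, ...)
    is an integer from 1 to 9, e.g. 30, 400, or 7000.

    Hypothetical helper for illustration; not the study's implementation.
    """
    # Numbers below 10 (including 0) cannot be divided by 10 or more
    # and still give an integer from 1 to 9.
    if n < 10:
        return False
    # Strip trailing zeros, i.e. divide by the largest power of ten that fits.
    while n % 10 == 0:
        n //= 10
    # What remains must be a single non-zero digit.
    return 1 <= n <= 9


# Example: 100 and 3000 have this property, 99 and 250 do not.
examples = {k: is_single_digit_multiple_of_power_of_ten(k) for k in (99, 100, 250, 3000)}
```

Under this check, 100 and 3000 count as round while 99 and 250 do not. The other roundness properties mentioned in the description would need their own checks.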


