Symbolic Number Skills Predict Growth in Nonsymbolic Number Skills in Kindergarteners

There is currently considerable discussion about the relative influences of evolutionary and cultural factors in the development of early numerical skills. In particular, there has been substantial debate and study of the relationship between approximate, nonverbal (approximate magnitude system [AMS]) and exact, symbolic (symbolic number system [SNS]) representations of number. Here we examined several hypotheses concerning whether, in the earliest stages of formal education, AMS abilities predict growth in SNS abilities, or the other way around. In addition to tasks involving symbolic (Arabic numerals) and nonsymbolic (dot arrays) number comparisons, we also tested children’s ability to translate between the 2 systems (i.e., mixed-format comparison). Our data included a sample of 539 kindergarten children (M = 5.17 years, SD = .29), with AMS, SNS, and mixed-comparison skills assessed at the beginning and end of the academic year. In this way, we provide, to the best of our knowledge, the most comprehensive test to date of the direction of influence between the AMS and SNS in early formal schooling. Results were more consistent with the view that SNS abilities at the beginning of kindergarten lay the foundation for improvement in both AMS abilities and the ability to translate between the 2 systems. It is important to note that we found no evidence to support the reverse. We conclude that, once one acquires a basic grasp of exact number symbols, it is this understanding of exact number (and perhaps repeated practice therewith) that facilitates growth in the AMS. Though the precise mechanism remains to be understood, these data challenge the widely held view that the AMS scaffolds the acquisition of the SNS.

In the search for the origins of numerical abilities, there has been a debate regarding the extent to which basic, possibly innate, skills that humans share with nonhuman animals serve as the foundation for more complex skills that are taught more deliberately via cultural practices and, in particular, via formal education. Conversely, it is of considerable interest whether cultural inputs may shape and refine more evolutionarily basic abilities.
In the domain of numerical cognition and mathematics education, a major current debate concerns the interplay between the ability to discriminate between approximate, nonverbal magnitudes (such as arrays of dots) and the ability to represent numbers in exact, symbolic form (e.g., as with Indo-Arabic numerals). The former, approximate and nonverbal capacity is shared across many species (for reviews, see Agrillo & Beran, 2013; Nieder & Dehaene, 2009; Pahl, Si, & Zhang, 2013) and, at least in basic form, is thought to be present from birth in humans (e.g., Lipton & Spelke, 2003; Xu & Spelke, 2000). Recently, it has become apparent that humans and other species can do more than just compare nonverbal magnitudes; they can even perform simple, approximate arithmetic (such as sums and ratios; e.g., Brannon, Wusthoff, Gallistel, & Gibbon, 2001; Capaldi & Miller, 1988; Matthews, Lewis, & Hubbard, 2016; McCrink & Spelke, 2010). Such abilities are thought to be underpinned by what is often referred to as the approximate number system, or ANS. There is some debate over the extent to which these approximate magnitudes are strictly numerical (e.g., Leibovich & Henik, 2013); hence, for present purposes, we adopt the broader term: approximate magnitude system (AMS).
Number symbols are cultural in origin, because they rely on arbitrary social conventions to determine their forms and the basic rules that govern how they are to be mathematically manipulated (Zhang & Norman, 1995). We refer here to the ability to represent and manipulate number symbols as the symbolic number system, or SNS. Over the course of development, children go from viewing symbols as meaningless shapes to having a rich understanding of their meaning. A key question is how children develop an understanding of the meaning of number symbols and to what representations these symbols become linked during this ontogenetic process.
One view that has garnered considerable attention is that the AMS plays a crucial role as the foundation for and scaffold of the SNS (Dehaene, 1997, 2008; Feigenson, Dehaene, & Spelke, 2004; Feigenson, Libertus, & Halberda, 2013; Gallistel & Gelman, 2000; Piazza, 2010). In other words, the evolutionarily basic capacity, in the form of the AMS, is thought to provide the critical foundation for the culturally acquired capacity (the SNS). This is an intuitive hypothesis, capitalizing on the broader notion that cultural skills must at some level co-opt extant, evolutionarily ancient, neural structures (Dehaene & Cohen, 2007). Moreover, many of the basic operations that form the basis of the SNS (such as relative quantity [greater-lesser], ordinality, and even simple arithmetic) are available, at least in approximate form, to the AMS (e.g., Brannon et al., 2001; Capaldi & Miller, 1988; Matthews et al., 2016; McCrink & Spelke, 2010). Furthermore, numerous studies have shown that the precision of an individual's AMS is predictive of SNS abilities (for a review and meta-analysis, see Chen & Li, 2014). There is also work to suggest that arithmetic training using approximate, nonsymbolic magnitudes (arrays of dots) leads to improvement in symbolic calculation scores (Hyde, Khanum, & Spelke, 2014; Park & Brannon, 2013). Taken together, these results strongly suggest that one's early formal understanding of the SNS may be bootstrapped from the informal AMS. A straightforward prediction is that children who begin kindergarten with stronger AMS skills should be best positioned to acquire early formal SNS skills during the school year: Children with strong AMS skills at the beginning of the year should show the strongest growth in SNS skills over the course of the year. An alternative view is that once one acquires a basic grasp of exact number symbols, it is this exact understanding of number (and repeated practice therewith) that facilitates growth in the AMS.
In other words, approximate, nonverbal quantities increasingly come to be understood in symbolic terms (Mix, 2008; Mix, Huttenlocher, & Levine, 1996; Mix, Huttenlocher, & Levine, 2002; Mussolin, Nys, Leybaert, & Content, 2016). From this perspective, it is the culturally acquired SNS that refines the evolutionarily more basic capacity (the AMS). Though perhaps somewhat counterintuitive, there is evidence to support the notion that cultural inputs may have an influence on the AMS. For example, Piazza, Pica, Izard, Spelke, and Dehaene (2013) examined adults and children among the Mundurucú (an indigene group in the Amazon) and found that those with access to education showed higher AMS acuity, specifically higher precision when comparing arrays of dots. No difference as a function of education was found on a task requiring participants to determine which of two disks was larger in area. These results suggest that educational inputs may increase the precision of the evolutionarily more basic capacity to process approximate magnitudes. That said, one cannot rule out the possibility that those with higher AMS acuity were predisposed to seek out educational opportunities. Moreover, it is not clear precisely what educational inputs (e.g., math education in particular or socialization aspects of the educational environment) may have been responsible for the observed difference in AMS acuity. Mussolin, Nys, Content, and Leybaert (2014) recently provided evidence suggesting that symbolic numerical skills in particular predict improvement in AMS acuity. They assessed children at two time points (roughly seven months apart). They showed that scores on a symbolic numerical battery at Time 1 significantly predicted growth in AMS acuity (i.e., AMS acuity at Time 2, after controlling for AMS acuity at Time 1).
It is interesting that the reverse relation was both nonsignificant and significantly smaller (i.e., SNS1 ~ ΔAMS > AMS1 ~ ΔSNS), suggesting the direction of influence runs specifically from the SNS to the AMS and not the other way around (for a similar result in a sample of 30 first graders, see also Matejko & Ansari, 2016).

Linking Symbolic and Nonsymbolic Numerical Processing
Whether early numerical influences flow primarily from the AMS to the SNS or the other way around, both accounts require some mechanism for linking the two systems to transmit this influence. From the AMS → SNS perspective, one such suggestion is that approximate representations of quantity in fact serve as the semantic content of their symbolic counterparts (e.g., Dehaene, 2008; Feigenson et al., 2004, 2013; Piazza, 2010; Piazza, Pinel, Le Bihan, & Dehaene, 2007; Verguts & Fias, 2004). This view presupposes a strong link between the AMS and SNS more or less from the time one begins to acquire number symbols. On the other hand, from the SNS → AMS perspective, one need not presuppose a direct link between the SNS and AMS even early in development. Hence, an important additional piece of the debate is the need for and possible development of an ability to translate between symbolic and approximate numerical representations (Brankaer, Ghesquière, & De Smedt, 2014; Mundy & Gilmore, 2009).
One way to probe this translational ability is to ask participants to match or compare quantities across (symbolic and nonsymbolic) formats. In a study with adults, Lyons, Ansari, and Beilock (2012) demonstrated that the ability to compare a symbolic (numeral) stimulus with a nonsymbolic (dot array) stimulus is nontrivial: Mixed-format comparisons took significantly longer than did either symbolic or nonsymbolic same-format comparisons. The cost of mixing numerals and dots was also significantly higher than the cost of mixing numerals and number-words, suggesting the key distinction is not necessarily visual format but whether the stimuli point to symbolic (SNS) or nonsymbolic (AMS) representations. Recent neural findings are also consistent with this distinction (Bulthé, De Smedt, & Op de Beeck, 2014; Bulthé, De Smedt, & Op de Beeck, 2015; Damarla, Cherkassky, & Just, 2016; Damarla & Just, 2013; Lyons, Ansari, & Beilock, 2015). In other tasks that force one to translate between AMS stimuli and verbal symbols (number-words), one finds systematic biases (Crollen, Castronovo, & Seron, 2011). On the other hand, each of these studies, including Lyons et al. (2012), used literate adults as participants. Thus, the distinction between the AMS and SNS may arise only over time, after considerable educational experience has perhaps shifted the focus onto symbolic representations of number. Indeed, Lyons et al. strongly implied that their results were likely specific to literate adults and suggested that the AMS and SNS may be strongly linked early in development and become "estranged" only later in development. This proposal is more in keeping with the AMS → SNS perspective because it assumes a central role for the AMS in shaping early SNS representation. From this perspective, and in particular assuming the early meaning of number symbols is derived directly from their nonsymbolic counterparts (e.g., Dehaene, 2008; Feigenson et al., 2004; Piazza et al., 2007; Verguts & Fias, 2004), one would expect the cost of mixing numerals and dots to be minimal in younger children, with this cost increasing over developmental time as the two systems (AMS and SNS) become increasingly distinct. An alternative view is that the AMS and SNS are actually formed more or less independently of one another (Carey, 2011; Le Corre & Carey, 2007). This would imply a high cost of mixing symbolic and nonsymbolic inputs early in development. It is intriguing that the SNS → AMS view outlined earlier would predict a reduction in the cost of mixed-format comparisons over development. This is because one's understanding of approximate, nonverbal quantities is increasingly informed by one's knowledge of and experience with exact symbolic representations of numbers (Mix, 2008; Mix et al., 1996; Shusterman, Slusser, Halberda, & Odic, 2016), thereby increasing the link between the AMS and the SNS over developmental time.
This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
Here it is also important to note that the AMS → SNS view and SNS → AMS view differ in terms of what is more likely to lead to improvement in the ability to translate between symbolic and nonsymbolic quantities. From the former (AMS → SNS) perspective, an individual with high AMS acuity should have less trouble affixing approximate, nonsymbolic quantities to exact symbolic representations: Early AMS skills should predict growth in symbolic-nonsymbolic mixing skills. From the latter (SNS → AMS) perspective, nonsymbolic quantities should increasingly be understood in terms of their symbolic counterparts: Early SNS skills should predict growth in symbolic-nonsymbolic mixing skills. Hence, it is crucial to assess the degree to which early AMS skills, early SNS skills, or both predict developmental change in how well children translate between the two systems.

Current Study
In the current study, we empirically probe these two accounts of early symbolic number development, along with the hypotheses each raises, in a single, large, longitudinal study focusing on over 500 children over the course of kindergarten (i.e., in the fall and then the spring). Hence, we focused on the early stages of formal education to address the broader questions (a) To what extent are the AMS and SNS linked at the outset of formal education? and (b) Does the AMS shape the SNS, or the other way around? In sum, here we provide the most comprehensive directional test to date of the early relation between the AMS and the SNS.
We examined this question by testing several complementary hypotheses (summarized in Table 1). 1 First, we examined the cost of mixing between formats in a manner similar to that in Lyons et al. (2012) by computing the degree to which performance on a mixed-comparison task was worse than that on dot- and numeral-comparison tasks (in particular, the critical difference is between mixed comparison and whichever shows worse performance, dots or numerals). Specifically, because the AMS is thought to form the foundation of the SNS, the AMS → SNS view predicts a minimal early (in the fall) cost of mixing formats (Hypothesis 1a; see Table 1 for a summary of hypotheses) that either increases or stays constant over time (from fall to spring; Hypothesis 2a). The SNS → AMS view assumes the two systems are initially distinct and so predicts a large early (in the fall) mixing cost (Hypothesis 1b) that, due to reinterpretation of the AMS in terms of the SNS, lessens over time (from fall to spring; Hypothesis 2b).
Next, we examined whether early (fall) AMS or SNS ability is a better predictor of growth (change from fall to spring) in the ability to translate between the two systems. Under the assumption that the AMS provides the critical foundation for the culturally acquired SNS, the AMS → SNS view predicts that dot comparison in the fall will be a better predictor of mixed-comparison growth (Hypothesis 3a); the SNS → AMS view predicts that numeral comparison in the fall will be a better predictor of mixed-comparison growth (Hypothesis 3b).
Finally, we examined whether early (fall) AMS ability predicts growth in SNS ability (change from fall to spring), the other way around, or both. The AMS → SNS view holds that dot comparison in the fall will be a better predictor of numeral-comparison growth (Hypothesis 4a); the SNS → AMS view holds that numeral comparison in the fall will be a better predictor of dot-comparison growth (Hypothesis 4b). The relation could be bidirectional (Hypothesis 4c, as suggested by Feigenson et al., 2013; Mussolin et al., 2016). Note also that, should results from testing the preceding hypotheses provide evidence that the AMS and the SNS are more likely to be distinct systems even in kindergarteners, it is still important to test Hypothesis 4. Two distinct systems can still influence one another; thus, one could posit a modified version of the AMS → SNS view in which the AMS influences SNS development without being part of the same underlying system.
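The mixing cost referred to in Hypotheses 1 and 2 can be illustrated with a minimal sketch. It assumes the cost is taken as the worse of the two same-format scores minus the mixed-format score; the function name `mixing_cost` and all numbers below are hypothetical and are not the study's data.

```python
def mixing_cost(dot: float, numeral: float, mixed: float) -> float:
    """Return how much worse mixed-format performance is than the worse
    of the two same-format tasks (positive values indicate a cost)."""
    return min(dot, numeral) - mixed


# Hypothetical mean adjusted scores, for illustration only.
fall = {"dot": 20.0, "numeral": 24.0, "mixed": 14.0}
spring = {"dot": 26.0, "numeral": 33.0, "mixed": 24.0}

fall_cost = mixing_cost(**fall)      # 20 - 14 = 6.0
spring_cost = mixing_cost(**spring)  # 26 - 24 = 2.0
print(fall_cost, spring_cost)
```

In this made-up case the cost is large in the fall and shrinks by spring, which is the pattern Hypotheses 1b and 2b describe; a near-zero fall cost that stays flat or grows would instead match Hypotheses 1a and 2a.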

Participants
Participants were 613 children in Senior Kindergarten 2 from 36 schools in the greater Toronto area (all schools are part of the Toronto District School Board, TDSB, whose students comprise one of the largest and most diverse school districts in Canada). Of these, 539 children completed all three critical number-comparison tasks at both time points; subsequent analyses proceeded with N = 539 (241 female). Mean age at the time of the first testing session was 5.17 years (SD = .29, range = 4.67-5.77). Sixty-five children were not born in Canada. Socio-
1 Our measure of AMS ability was a standard dot-comparison task; SNS ability was measured via a numeral comparison task; and the ability to translate between the two systems was measured via a mixed-format comparison task.
2 Note that in several Canadian provinces, Kindergarten is divided into 'Junior' and 'Senior' Kindergarten. The former is akin to what is often referred to elsewhere as 'preschool'; this typically takes place when children are about 4 years old and is often relatively informal in overall structure. Senior Kindergarten is perhaps more closely related to what is referred to as Kindergarten in other areas. Instruction at this phase, while not as strictly structured as first grade, nevertheless begins to emphasize many basic formal concepts in mathematics and other areas.

Procedure
All behavioral data were collected using paper-and-pencil Numeracy Screener booklets (complete versions of the screeners can be found in the online supplemental materials). Data were collected at two time points: fall of 2014 and spring of 2015. 3 The average interval between time points was 191.84 days (SD = 14.30, range = 141-217). The same version of the booklets was used at each time point.
Data were collected in collaboration with teachers, early childhood educators (ECEs), and administrators in TDSB schools. Testing materials were approved by the University of Western Ontario's Non-Medical Research Ethics Board. The data reported here are part of a larger joint research project between the TDSB and the University of Western Ontario (UWO), which has been approved by the TDSB's External Research Review Committee (ERRC). The TDSB's research department was authorized by the board to collect student personal information and assessment data for the purposes of the board's educational planning. For this joint research project, parents of participating students were informed by their respective schools that the assessment data were to be collected by the classroom educators and that the confidential student-level data collected were to be kept strictly within the TDSB's Research and Information Services. Only depersonalized data without any school or student identifiers were shared with external partners (including researchers at UWO).
All data collection was completed by teachers and early childhood educators from the classrooms where testing took place. Testing was conducted on a one-on-one basis with each student in a separate, quiet area, requiring approximately 15-20 min per student. Teachers and ECEs were provided with an in-service work day during which they were given explicit training on administering test booklets. Written instructions were also provided for each task in the booklet.
For each task, teachers went over a predefined set of instructions with the child. These instructions were printed in the booklet at the start of each task, along with other guidelines for administering the task. The teacher went over several examples with the child. They then explained to the child, "You should try to complete as many problems as you can. You have two minutes. Work as fast as you can without making too many mistakes. If you make a mistake, draw an X through the mistake and put a new line through the right answer."
Teachers then demonstrated how to correct a mistake. A corrected answer was counted as correct. Children were also instructed to complete the items for a given task in the order they were presented, without skipping items ("Make sure you don't skip any items"). Once the child was ready, the teacher started the timer, and the child turned the page to begin.

Comparison Tasks
Task booklets were based loosely on the design originally developed by Nosworthy, Bugden, Archibald, Evans, and Ansari (2013). Booklets contained six numerical tasks, though only three of these (the three number-comparison tasks, described in detail in the next sections) are of direct theoretical relevance to the questions and hypotheses outlined earlier. Children completed three comparison tasks (numeral comparison, dot comparison, and mixed comparison), always in that order. Each comparison task comprised 72 items, with 12 items per page. Children were given 2 min to complete as many items as possible per task.
Numeral comparison (NC). Examples of the numeral-comparison task are shown in Figure 1a. Children were told, "In this task, your job is to decide which of the two numbers is bigger. Draw a line through the box with the number that means the most things." Numerals ranged from 1 to 9, with absolute numerical distances |n1 − n2| of 1 to 3 and ratios (minimum/maximum) from .250 to .889. Specifically, all 15 combinations of 1-9 with distances of 1 or 2 were included, along with three combinations with distance 3 ({1,4}, {3,6}, {6,9}). This yielded 18 possible combinations. Of these 18, nine were permuted such that the larger number was on the left, and the other nine were permuted such that the larger number was on the right. The nine trials in each group were chosen such that the side of the larger number was unrelated to numerical size, distance, or ratio. The next 18 trials were arranged in the opposite manner. The last 36 trials were determined in the same manner. Trial order was then pseudorandomized within each set of 18 trials such that, for any nth item in the sequence, average numerical ratio, size, and distance were equated across comparison tasks (numeral, dot, mixed). This final step ensured that if, for instance, a given child completed exactly 10 trials on each of the three comparison tasks, the ratios (or sizes or distances) encountered on each task would not have differed significantly across tasks (all ps > .20). In other words, comparing performance across tasks was not confounded with these numerical factors.
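The pair counts and ratio range stated above can be checked with a short enumeration. This is only a sketch of the stimulus set as described in the text (all pairs of 1-9 at distance 1 or 2, plus the three named distance-3 pairs); the variable names are illustrative, and the left/right assignment and pseudorandomization steps are not reproduced.

```python
from itertools import combinations

# All unordered pairs (smaller, larger) of the numerals 1-9.
pairs = list(combinations(range(1, 10), 2))

# Every pair at numerical distance 1 or 2 ...
near = [(a, b) for a, b in pairs if b - a in (1, 2)]
# ... plus the three distance-3 pairs named in the text.
far = [(1, 4), (3, 6), (6, 9)]
stimuli = near + far

# Ratio is defined as minimum / maximum.
ratios = [a / b for a, b in stimuli]

print(len(near), len(stimuli))                       # 15 18
print(round(min(ratios), 3), round(max(ratios), 3))  # 0.25 0.889
```

Four blocks of these 18 pairs give the 72 items per task.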

Dot comparison (DC).
Examples of the dot-comparison task are shown in Figure 1b. Children were told, "In this task, your job is to decide which of two boxes contains more dots. Draw a line through the box that has the most dots in it." Children were also instructed, "Don't try to count the dots. Instead, just look at the dots and try your best to guess which side has more dots in it." Numerosities and trial order were determined in the same manner as in the numeral-comparison task just described. In addition, two versions of a given permutation were created. In one version, dot area was positively correlated with numerosity, and overall contour length was negatively correlated with numerosity; in the other version, the opposite was true. On a given trial, the two parameters were thus in opposition; between trials, relying on any single parameter would have led to chance performance (Gebuis & Reynvoet, 2012). Parameter version order was further pseudorandomized such that it was not informative of the correct answer within a given segment of trials.

Mixed comparison (MC).
Examples of the mixed-comparison task are shown in Figure 1c. Children were told, "In this task, your job is to decide whether a number or a group of dots means more things. If the number means more things, draw a line through the number. If the dots mean more than the number, then draw a line through the dots."
As with dot comparisons, children were also instructed, "Don't try to count the dots. Instead, just look at the dots and try your best to guess which side means more." Numerosities and trial order were determined in the same manner as in the numeral-comparison task described earlier. In addition, which side contained the numeral and which side the dots was pseudorandomized such that it was not informative of the correct answer within a given segment of trials.
Scoring. Raw scores were computed as the number of items correctly completed within the 2-min time limit. In a timed task such as this, it is crucial to adjust for guessing; randomly guessing on all 72 items would yield, on average, a raw score of 36. Hence, scores were adjusted for guessing using this standard adjustment:
$$A = C - \frac{I}{P - 1},$$
where A is the adjusted score, C is the number correct, I is the number incorrect, and P is the number of response options (Rowley & Traub, 1977). This method has the effect of adjusting the expected score of a pure guessing strategy to 0. For instance, in a four-option multiple-choice exam (where each choice is equally probable), randomly guessing on 20 items would yield, on average, an unadjusted score (C in the equation) of 5. The adjusted score (A) would be
$$A = 5 - \frac{15}{4 - 1} = 0.$$
In the current case, each item had two alternatives (left and right quantity, equally likely to be correct), so adjusted scores were effectively correct minus incorrect (A = C − I). That said, results were highly similar regardless of whether adjusted or raw scores were used, indicating that the presence of guessing strategies did not substantially influence the results. Mean adjusted scores are given in Figure 2.
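The guessing adjustment can be sketched in a few lines. The form A = C − I / (P − 1) is the standard correction for guessing and is assumed here because it reproduces both worked cases in the text (an expected score of 0 for pure guessing, and A = C − I when there are two options); the function name is hypothetical.

```python
def adjust_for_guessing(correct: int, incorrect: int, options: int) -> float:
    """Correction for guessing: A = C - I / (P - 1)."""
    if options < 2:
        raise ValueError("need at least two response options")
    return correct - incorrect / (options - 1)


# Four-option example from the text: 20 guessed items, 5 correct on average.
print(adjust_for_guessing(5, 15, 4))   # 0.0

# Two-option case used in the study: pure guessing on all 72 items.
print(adjust_for_guessing(36, 36, 2))  # 0.0

# Hypothetical child: 30 correct, 4 incorrect within the time limit.
print(adjust_for_guessing(30, 4, 2))   # 26.0
```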

Covariates
For all regression analyses, the following variables were included as control variables: age (years), sex, whether a child was born in Canada (0 or 1), school SES, percentage of days absent during the kindergarten school year (M = 8.9%, SD = 7.2, range = 0-50.3), 4 and testing interval (in days). 5
In addition, we controlled for children who may not have understood a given task. Rather than exclude a child altogether for failing to perform above chance (adjusted score > 0) on a single task, we instead created a dummy variable for each task, coded as 1 if a child's adjusted score ≤ 0, or 0 if a child's adjusted score > 0. In any regression model that included a given task (as either an independent or a dependent variable), the corresponding dummy variable for that task was also included as a covariate of no interest. Note that these variables were necessarily (anti-) correlated with the variables of interest: actual performance (adjusted or otherwise). This makes this approach relatively conservative, indicating that we have perhaps slightly underestimated effect sizes (i.e., partial correlations). It is important to note, therefore, that all nonsignificant (p ≥ .05) results remained so, whether or not these dummy variables were included.

Predicting Growth
To test Hypotheses 3 and 4 (regarding the direction of influence between the AMS and SNS), we examined which tasks at Time 1 uniquely predicted growth in which of the other tasks. As recommended by Castles and Coltheart (2004; see also a similar discussion in Sasanguie, Defever, Maertens, & Reynvoet, 2014), growth was assessed by predicting spring (Time 2) scores after controlling for fall (Time 1) scores. This is an estimate of growth because it removes the Time 1 variance from both the predictor and the outcome, meaning that any residual relation between the predictor and outcome is, by definition, specific to what is unique to the outcome at Time 2. 6 Predicting Time 2 while controlling for Time 1 is also preferable because the within-task relation across time (e.g., DC Time 1 ~ DC Time 2) is interpretable. Such within-task relations (i.e., the circular arrows in Figure 3) are not, strictly speaking, predictions of growth; importantly, however, they are unique to that task, because the other tasks (e.g., NC and MC at Time 1) are included in the model. In this way, consistent with Mussolin et al. (2014), we computed three regression models, predicting growth in each of the three comparison tasks. For example, to predict numeral-comparison growth, the outcome was adjusted scores on spring (Time 2) numeral comparison, and the main predictors of interest were fall (Time 1) numeral, dot, and mixed adjusted scores (along with all the relevant covariates, as noted earlier).
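The logic of predicting Time 2 while controlling for Time 1 can be illustrated with a first-order partial correlation. This is a deliberately simplified sketch: it controls for a single variable (the outcome task at Time 1), whereas the models described above also include the remaining Time 1 task and all covariates. The scores and function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical adjusted scores for six children (illustration only).
nc_t1 = [10, 14, 8, 20, 16, 12]   # numeral comparison, fall
dc_t1 = [12, 13, 9, 18, 15, 14]   # dot comparison, fall
dc_t2 = [15, 19, 11, 27, 21, 17]  # dot comparison, spring

# Does fall numeral comparison relate to spring dot comparison
# once fall dot comparison is held constant?
growth_r = partial_corr(nc_t1, dc_t2, dc_t1)
print(round(growth_r, 3))
```

A nonzero value here corresponds to a cross-task arrow (e.g., NC Time 1 to DC Time 2) in a growth model of this kind.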

Reliability
Due to time constraints, we used timed tests for the three comparison tasks; interitem reliability (i.e., Cronbach's alpha) could not be computed over all 72 possible items in each task because not all children completed all 72 items (indeed, few did). Instead, we identified the maximum number of trials that at least two thirds of participants completed across all three tasks (we used Time 1 data here, because that was the first time children encountered all tasks). The first 20 trials for dot-, numeral-, and mixed-comparison tasks were completed by 70.3%, 68.8%, and 67.7% of participants, respectively. Limiting reliability estimates to just these participants and just the first 20 trials, we found reasonable to good reliability for all three tasks (dot comparison: α = .70, numeral comparison: α = .83, mixed comparison: α = .69; results were similar if other arbitrary thresholds were adopted).
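The reliability procedure described above (restrict to children who completed the first n trials, then compute Cronbach's alpha over those trials) can be sketched as follows. The data layout, one list of 0/1 item scores per child with length equal to the number of items attempted, is an assumption made for illustration, as are the function names and the toy data.

```python
from statistics import variance


def first_n_complete(responses, n):
    """Keep children who completed at least n items; truncate to the first n."""
    return [child[:n] for child in responses if len(child) >= n]


def cronbach_alpha(scores):
    """Cronbach's alpha for a children-by-items table of item scores."""
    k = len(scores[0])
    item_vars = [variance([child[i] for child in scores]) for i in range(k)]
    total_var = variance([sum(child) for child in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy responses (1 = correct, 0 = incorrect); list length = items attempted.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0],
    [1, 1],          # dropped: fewer than 3 items completed
]

subset = first_n_complete(responses, 3)
print(len(subset))                        # 4
print(round(cronbach_alpha(subset), 2))   # 0.75
```

In the study the cutoff was the first 20 trials; the cutoff of 3 here only keeps the toy table small.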
Though reliabilities were fairly comparable to one another, that for numeral comparison was a bit higher than for the other two tasks, which may have inflated regression results for this task. Indeed, differences in measurement reliability are known to pose a major potential confound for cross-lagged panel longitudinal models, such as the type we used here (Hamaker, Kuiper, & Grasman, 2015;Rogosa, 1980). To address this concern, we recomputed the partial correlations in Figure 3, but only after disattenuating the relation between each pair of critical variables for the reliabilities of the relevant variables (e.g., Murphy & Davidshofer, 2004). Our central conclusions from testing Hypotheses 3 and 4 remained unchanged (see Appendix C for complete results), though it is important to acknowledge that, if some alternative conception of the comparison measures' reliabilities were devised in the future, results might change.
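Disattenuation can be illustrated with the classical correction for attenuation, which divides an observed correlation by the square root of the product of the two reliabilities. Whether exactly this form was applied to the partial correlations in Figure 3 is an assumption; the observed correlation below is hypothetical, while the two alpha values are those reported above.

```python
from math import sqrt


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / sqrt(rel_x * rel_y)


alpha_dot, alpha_numeral = 0.70, 0.83  # reliabilities reported in the text
observed_r = 0.30                      # hypothetical observed correlation

corrected_r = disattenuate(observed_r, alpha_dot, alpha_numeral)
print(round(corrected_r, 3))
```

Because a lower reliability produces a larger upward correction, relations involving the less reliable dot and mixed tasks are raised more than those involving numeral comparison, which is why the correction speaks to the concern about unequal reliabilities.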

Manipulation Check-Ratio Effects
An important assumption of the AMS view is that nonsymbolic magnitude processing reliably elicits ratio effects (as the ratio between the two quantities being compared approaches 1, performance diminishes). It was thus important to check that the dot-comparison task in particular reliably elicited ratio effects. We assessed this at Time 1 (when children were least familiar with the tasks) by examining whether accuracy decreased as ratio increased (ratio was computed here as minimum/maximum). For simplicity, we binned trials into far (ratio ≤ . … It is also important to see that ratio reliably predicted accuracy on an individual child level (Lyons, Nuerk, & Ansari, 2015). For each child, we computed the correlation (r value) between ratio and accuracy, where a ratio effect would be indicated by a strong negative correlation. Average r values for the three tasks were as follows: DC: mean r = -.735 (SE = .016), MC: mean r = -.635 (SE = .021), NC: mean r = -.366 (SE = .024). In sum, the dot- (and mixed-) comparison tasks reliably elicited ratio effects at the population and individual levels, indicating that they were indeed measuring the AMS. Though the numeral-comparison task showed a ratio effect when averaging across all children, this was much less reliable for individual subjects (consistent with what was seen among children in Grades 1-6; Lyons, Nuerk, & Ansari, 2015).

⁶ Note that this method is preferred over using change scores (e.g., Time 1 - Time 2) for several reasons. First, change scores contain variance from both Time 1 and Time 2. If one found a correlation, for example, between numerals (NC) at Time 1 and dot (DC) change scores, it would not be clear whether this was due to DC variance at Time 1 or at Time 2 (or both). One can include DC scores at Time 1 as a covariate to remove this variance; however, this yields results identical to just predicting Time 2 after controlling for Time 1 (as we have done here). Second, a given score at Time 1 will necessarily be correlated with a change score based on that same variable (e.g., DC at Time 1 predicting DC change scores), making such results largely uninterpretable.
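The per-child ratio-effect check described above can be sketched as follows. This is a minimal illustration, assuming that each child's accuracy is summarized per ratio level and that ratio is the smaller quantity divided by the larger; the function names and the example child are hypothetical, not the study's data or code.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ratio_effect(trials):
    """Correlate ratio (min/max) with accuracy for one child.

    trials -- list of (quantity_a, quantity_b, accuracy) tuples, where
              accuracy is the proportion correct at that ratio level.
    A strongly negative r indicates a ratio effect.
    """
    ratios = [min(a, b) / max(a, b) for a, b, _ in trials]
    accuracy = [acc for _, _, acc in trials]
    return pearson_r(ratios, accuracy)


# Hypothetical child: accuracy drops as the ratio approaches 1.
child = [(1, 4, 1.0), (2, 4, 0.9), (3, 4, 0.7), (4, 5, 0.6)]
r = ratio_effect(child)
print(r < 0)  # a negative r is the signature of a ratio effect
```

Averaging these per-child r values within each task would give task-level summaries of the kind reported above.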

Results
For all tests, effect sizes and exact p values are given for additional context; the significance threshold was p < .05. Raw data can be found here: https://osf.io/uf2gb/.

Developmental Changes and Task Differences
To test Hypotheses 1a and 1b and Hypotheses 2a and 2b (whether there is a cost of mixing formats at the outset of kindergarten, and how this changes over the course of the school year), we examined longitudinal changes in performance from fall to spring for the three comparison tasks. The AMS → SNS view predicts that mixed-comparison performance will be no worse than dot-comparison performance in the fall (Hypothesis 1a); this difference should either remain constant or increase in the spring (Hypothesis 2a). The SNS → AMS view predicts that mixed-comparison performance will be worse than dot-comparison performance in the fall (Hypothesis 1b); this difference should be significantly reduced at Time 2 (spring; Hypothesis 2b).
Data were first entered into a 3 (task: numeral, dot, mixed) × 2 (time: fall, spring) within-subject analysis of variance (results are summarized in Figure 2). The main effect of task was significant, F(2, 1076) = 335.33, p = 6E-114, d = 1.58. Performance was overall best on numeral comparisons, followed by dot comparisons, and performance was worst on mixed comparisons. Numeral-comparison scores were significantly higher than were dot-comparison scores at both time points: fall: t(538) = 8.62, p = 7E-17, d = .74; spring: t(538) = 19.53, p = 1E-64, d = 1.68. Dot-comparison scores were significantly higher than were mixed-comparison scores at both time points: fall: t(538) = 7.10, p = 4E-12, d = .61; spring: t(538) = 4.04, p = 6E-05, d = .35. Note that this latter result indicates that comparing between formats was more difficult than comparing within either symbolic or nonsymbolic formats. This result is thus more consistent with the view that symbolic and nonsymbolic representations of number form dissociable representation systems at the very outset of formal schooling, which is consistent with Hypothesis 1b but not 1a. In addition, the difference between dot and mixed comparison was significantly less at Time 2: F(1, 538) = 6.41, p = .012, d = .22, which is consistent with Hypothesis 2b but not 2a. In sum, tests of both Hypotheses 1 and 2 supported the SNS → AMS view: There was a cost of mixing symbolic and nonsymbolic formats even at the outset of kindergarten, and this cost decreased over the course of the school year.
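For readers who want to relate the t and d values above, the paired-comparison effect sizes appear to be consistent with the common conversion d = 2t/√df. This is an inference from the reported numbers rather than a formula stated in this section, so the sketch below should be read as an assumption; the function name is hypothetical.

```python
import math


def d_from_t(t, df):
    """Convert a t statistic to Cohen's d via d = 2t / sqrt(df).

    Assumption: this conversion is inferred from the reported values,
    not confirmed as the authors' method.
    """
    return 2.0 * t / math.sqrt(df)


# t values and degrees of freedom as reported in the text above.
for label, t in [("NC vs. DC, fall", 8.62), ("NC vs. DC, spring", 19.53)]:
    print(label, round(d_from_t(t, 538), 2))
```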
With respect to characterizing change from Time 1 to Time 2 as "growth," it is crucial to note that the main effect of time was significant, F(1, 1076) … It is also important to note that the Task × Time interaction was significant, F(2, 1076) = 5.07, p = 2E-21, d = .61. Longitudinal improvement was significantly⁷ greater for NC than either DC, t(538) = 9.09, p = 2E-18, d = .78, or MC, t(538) = 6.91, p = 1E-11, d = .60, and it was significantly greater for MC relative to DC, t(538) = 2.53, p = .012, d = .22 (equivalent to the F test performed to test Hypothesis 2 earlier). These differences in longitudinal improvement are notable in several respects. First, they indicate differential developmental change with respect to symbolic, nonsymbolic, and mixed number processing. Second, they argue against the notion that task improvements may have been driven simply by familiarity with the tasks (which would predict merely a main effect of time).

Predicting Longitudinal Growth
In this section, we tested Hypotheses 3 and 4 (what the direction of influence between the AMS and the SNS is). According to the AMS → SNS view, dot comparison at Time 1 should be a strong unique predictor of both mixed-comparison growth (Hypothesis 3a) and numeral-comparison growth (4a). According to the SNS → AMS view, numeral comparison at Time 1 should be a strong unique predictor of both mixed-comparison growth (Hypothesis 3b) and dot-comparison growth (4b). Finally, it is of course possible to find a bidirectional influence on growth (Hypothesis 4c).
To test these hypotheses, we assessed the degree to which symbolic, nonsymbolic, and mixed numeral processing at the outset of formal education (i.e., at the beginning of kindergarten) uniquely predicted growth in one another over the course of the school year. Unique contributions were assessed via multiple regression, controlling not just for competing variables of interest but also for several covariates of no interest: age, sex, testing interval, absentee rates, school SES, birth location (in Canada or not), and chance performance (see the Method section). Growth was assessed by predicting spring (Time 2) scores after controlling for fall (Time 1) scores.
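The logic of predicting growth (Time 2 scores after controlling for Time 1 scores on the same task) can be illustrated with a first-order partial correlation. The sketch below is a deliberately simplified, single-covariate version: the study itself used multiple regression with additional competing predictors and covariates, which this example omits. All function names and data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def growth_partial_r(t1_predictor, t1_outcome, t2_outcome):
    """How strongly a Time 1 predictor relates to Time 2 outcome scores
    once Time 1 scores on the outcome task are controlled (i.e., growth)."""
    r_xy = pearson_r(t1_predictor, t2_outcome)
    r_xz = pearson_r(t1_predictor, t1_outcome)
    r_yz = pearson_r(t2_outcome, t1_outcome)
    return partial_r(r_xy, r_xz, r_yz)


# Hypothetical scores for six children (illustration only).
t1_numeral = [3, 5, 4, 7, 6, 8]
t1_dot = [4, 4, 6, 5, 7, 6]
t2_dot = [5, 6, 6, 8, 8, 9]
print(growth_partial_r(t1_numeral, t1_dot, t2_dot))
```

Controlling for the Time 1 outcome is what allows the remaining association to be read as prediction of change rather than of standing level, which is the rationale given for preferring this approach over raw change scores.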
Model results are visualized in Figure 3 (zero-order correlations between comparison scores are given in Table 2). Arrows in Figure 3 indicate growth directionality. For instance, the arrow pointing from DC to NC in Figure 3 denotes the degree to which dot comparisons at Time 1 predict growth in numeral comparisons; the arrow pointing the opposite direction denotes the degree to which numeral comparisons at Time 1 predict growth in dot comparisons (partial r and p values are also given; full model results can be found in Appendix A).
From Figure 3, it is clear that early (Time 1) numeral-comparison performance was a strong unique predictor of growth in the other two tasks, dot comparison did not significantly predict growth in either of the other two tasks, and mixed comparison was a moderate predictor of growth in the other tasks. Time 1 numeral comparison was a significantly⁸ stronger unique predictor of growth in mixed comparison than Time 1 dot comparison (.317 vs. .023; z = 4.95, p = 7E-07), which is consistent with Hypothesis 3b but not 3a. Time 1 numeral comparison was also a significantly stronger unique predictor of dot-comparison growth than Time 1 mixed comparison (partial rs = .284 vs. .108; z = 2.99, p = .003).
From Figure 3, it can be seen that symbolic skills are the primary unique predictor of growth in all three tasks. Indeed, Time 1 numeral comparison was a stronger unique predictor of Time 2 mixed comparison and dot comparison than either of those tasks at Time 1 was of itself (numeral vs. mixed comparison: .317 vs. .198; z = 2.07, p = .038; numeral vs. dot comparison: .284 vs. .147; z = 2.34, p = .019). Notably, the reverse was not … Moreover, Time 1 numeral comparison asymmetrically predicted growth in the other two comparison tasks. Time 1 numeral comparison was a stronger unique predictor of dot-comparison growth than was Time 1 dot comparison of numeral-comparison growth (partial rs = .284 vs. .035; z = 4.16, p = 3E-05), which is consistent with Hypothesis 4b and not 4a (and hence inconsistent with 4c as well). Time 1 numeral comparison was also a stronger unique predictor of mixed-comparison growth than was the reverse (.317 vs. .130; z = 3.21, p = .001). Time 1 mixed comparison was a stronger unique predictor of dot-comparison growth than was the reverse, albeit nonsignificantly so (.107 vs. .023; z = 1.37, p = .169).

⁷ Here, this is reported as t tests between change scores (spring - fall), which is equivalent to the relevant 2 × 2 interaction term.
⁸ This was computed by comparing partial correlations using Fisher's z tests. Fisher tests were chosen because, within a given model, partial rs are by definition independent (e.g., the partial relation between x₁ and y controls for x₂, and vice versa).
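The Fisher z comparisons reported above can be approximated with the standard test for the difference between two independent correlations. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and the `n_controls` parameter (the number of variables partialled out, which reduces the degrees of freedom for partial correlations) is an assumption about how such an adjustment would be made, since the exact computation is not specified in this section.

```python
import math


def fisher_z_compare(r1, r2, n1, n2=None, n_controls=0):
    """Two-tailed z test for the difference between two independent rs.

    r1, r2     -- the two (partial) correlations to compare
    n1, n2     -- sample sizes (n2 defaults to n1)
    n_controls -- number of variables partialled out of each correlation
                  (assumed adjustment; 0 for zero-order correlations)
    Returns (z, p).
    """
    if n2 is None:
        n2 = n1
    z1, z2 = math.atanh(r1), math.atanh(r2)  # Fisher r-to-z transform
    se = math.sqrt(1.0 / (n1 - 3 - n_controls) + 1.0 / (n2 - 3 - n_controls))
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p, standard normal
    return z, p


# Partial rs taken from the text: numeral vs. dot comparison at Time 1
# as predictors of mixed-comparison growth, with N = 539.
z, p = fisher_z_compare(0.317, 0.023, n1=539)
print(z, p)
```

With no adjustment for covariates, this gives a z value close to, though not identical with, the reported value; the small discrepancy would be expected if the reported test accounted for the variables controlled in the regression models.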
In sum, symbolic number processing at the beginning of kindergarten was a stronger predictor of growth in nonsymbolic and mixed-format processing than was the other way around. Indeed, symbolic number scores at the beginning of the year were a stronger unique predictor of nonsymbolic and mixed scores at the end of the year than were even nonsymbolic and mixed scores at the beginning of the year, respectively. Results support Hypotheses 3b and 4b and are thus overall more consistent with the SNS → AMS view.

Discussion
There is currently considerable discussion about the relative influences of evolutionary and cultural factors in the development of early numerical skills. One part of this debate centers around the relationships between approximate, nonverbal (AMS) and exact, symbolic (SNS) representations of number. Here we examined several hypotheses concerning whether, in the earliest stages of formal education, AMS abilities predict growth in SNS abilities, or the other way around. Moreover, we did so in a manner that takes into account the need to develop an ability to translate between the AMS and the SNS (i.e., mixed comparison). Our data derived from 539 kindergarten children, with AMS, SNS, and mixed-comparison skills assessed at the beginning and end of the academic year. In this way, we provide, to the best of our knowledge, the most comprehensive test to date of the direction of influence between the AMS and the SNS in early formal schooling. For all four hypotheses tested (1-4), results clearly favored the view that SNS abilities at the beginning of kindergarten lay the foundation for improvement in both AMS abilities and the ability to translate between the two systems (results were consistent with Hypotheses 1b-4b and not 1a-4a). Specifically, there was a significant cost of mixing formats present even at the outset of kindergarten, indicating an early dissociation between symbolic and nonsymbolic number systems (Hypothesis 1b). This mixing cost reduced over the course of the school year, indicating an increasing capacity to translate between the two systems (Hypothesis 2b). Growth in this format-mixing was predicted by symbolic but not nonsymbolic ability (Hypothesis 3b). And indeed, SNS ability predicted growth in AMS ability over the course of the year but not the other way around (Hypothesis 4b). 
More broadly, we conclude that, once one acquires a basic grasp of exact number symbols, it is this exact understanding of number (and repeated practice therewith) that in fact predicts growth in the AMS. We speculate that one's understanding of approximate, nonverbal quantities is increasingly informed by one's knowledge of and experience with exact symbolic representations of numbers, though the precise mechanism by which this may occur remains unknown. Candidate interpretations and mechanisms are discussed next.

Evidence for the SNS → AMS View
To identify the direction of influence between the AMS and the SNS in early education, we examined four hypotheses. First, we examined whether, at the outset of formal education, the AMS and the SNS should be considered separate systems to begin with. We found that, at the beginning of kindergarten (Time 1), scores on the mixed-comparison task were lower than those on either the numeral- or dot-comparison task. This suggests there is an additional cost to translating between the two formats. This result is broadly consistent with those found by Mundy and Gilmore (2009) and Brankaer et al. (2014). Mundy and Gilmore found in their second experiment that children roughly 7-8 years of age struggled with a mapping task (children judged which of two quantities matched the numerosity of a target, where the target was in a different format, symbolic or nonsymbolic, from that of the two options): Average accuracy was 62% (chance = 50%). The same children performed well on standard symbolic-symbolic (90.2% accuracy) and nonsymbolic-nonsymbolic (87.3%) comparison tasks. Because different tasks (matching vs. comparison) were used, it is difficult to directly compare performance on the same-format task with the mixed-format task (as, by contrast, we were able to do here; note also that a direct comparison between same- and mixed-format tasks was not a stated goal of Mundy & Gilmore, 2009). Nevertheless, it seems that children found the cross-format matching task relatively difficult. More recently, Brankaer et al. found a highly similar result using similar matching and comparison tasks in children approximately 7 and 9 years old.
Taken together with the current results (see especially Figure 2 and tests of Hypotheses 1 and 2 regarding the cost of mixing formats), it seems that the ability to translate between symbolic and nonsymbolic numerical formats is nontrivial, even in children as young as those at the start of kindergarten. This suggests that children's understanding of number symbols may emerge independently of the AMS (for a similar proposal, see, e.g., Carey, 2011; Le Corre & Carey, 2007). At minimum, our results suggest that the SNS is only indirectly linked to the AMS early in development, which is difficult to reconcile with the view that approximate representations of quantity serve as the semantic building blocks of their symbolic counterparts (e.g., Dehaene, 2008; Feigenson et al., 2004, 2013; Piazza, 2010; Piazza et al., 2007; Verguts & Fias, 2004). Furthermore, this result is in direct contrast to the prediction made by Lyons et al. (2012) that the dissociation between symbolic and nonsymbolic numerical representations emerges only later in development with increasing exposure to formal instruction with number symbols.
Indeed, we also found that the cost of translating between the AMS and the SNS lessened by the end of kindergarten. This surprising result is in fact broadly consistent with the notion that people's understanding of nonsymbolic quantities is increasingly shaped by (and thus tied to) their understanding of symbolic quantities (and is also consistent with results found by Mix, 2008).
Of course, one could alternatively interpret the reduced mixing cost at Time 2 by positing the reverse: Although perhaps initially distinct, the AMS nevertheless plays a role in shaping the SNS once formal schooling begins. Thus, it is crucial to examine competing longitudinal predictions of growth in the ability to translate between the SNS and the AMS: Is numeral comparison or dot comparison at the beginning of kindergarten a better unique predictor of growth in mixed comparison over the course of the school year?
The longitudinal results shown in Figure 3 clearly support the view that it is numeral comparison, and not dot comparison, that uniquely predicts growth in mixed comparison (hence consistent with Hypothesis 3b; partial rs of .317 vs. .023, respectively). Here again, results clearly support the view that early SNS skills predict growth in one's ability to translate between the SNS and the AMS. If children's understanding of AMS-based quantities is increasingly in terms of their grasp of SNS-based quantities, this is exactly what one would expect. We found indirect evidence that, consistent with this view, children were more likely to convert dot arrays to symbolic form than numerals to approximate magnitudes to complete the mixed-comparison task. Systematic underestimation when converting a dot array into a symbolic representation (via noncounting estimation) is expected and has been shown in adults to lead to poorer performance in a mixed-comparison task when the dot array is numerically larger than the numeral relative to the opposite (Lyons et al., 2012). This is exactly what we found here: Children were less accurate on mixed-comparison trials where the dot array was numerically greater than the numeral (dots > numeral: 68.9% correct, SE = 1.2; numeral > dots: 80.5% correct, SE = 1.0; p < .001). Even in kindergarten, children seem more inclined to process magnitudes in symbolic form, consistent with the notion that their understanding of the AMS may be increasingly influenced by their grasp of number symbols. It is also worth noting that the ability to map between symbolic and nonsymbolic quantities is a unique predictor of more complex math skills in 6- to 8-year-olds, a result that holds over and above the variance captured by more standard symbolic and nonsymbolic tasks (Mundy & Gilmore, 2009; see also Brankaer et al., 2014).
Hence, the ability to map between the AMS and the SNS is both nontrivial and potentially key to the development of math abilities more generally. However, contrary to the currently dominant view, our results suggest that this mapping ability may be facilitated primarily by children's increasing proficiency with numerical symbols and not approximate magnitudes.
The final hypothesis we tested concerns whether AMS skills at the beginning of the year predict growth in SNS skills over the course of the year (AMS → SNS; Hypothesis 4a), vice versa (SNS → AMS; Hypothesis 4b), or both (AMS ↔ SNS; Hypothesis 4c). Even if the AMS and the SNS are distinct representational systems, it is possible for the two systems to influence one another. It is thus important to know whether this is the case and, if so, to begin to accrue evidence regarding the direction of influence. Again, our results clearly supported the SNS → AMS view (Hypothesis 4b).
Numeral-comparison scores at the beginning of kindergarten were a stronger unique predictor of growth in dot-comparison scores over the course of the year than was the other way around (from Figure 3; partial rs = .284 vs. .035, respectively). Indeed, numeral comparison (Time 1) was a better unique predictor of dot comparison than dot comparison was of itself over the course of the year (from Figure 3; partial rs = .284 vs. .147, respectively). This indicates the SNS provides a stronger influence on the development (at least over the course of kindergarten) of the AMS than does the other way around.
It is important to note that this result is not without precedent. Indeed, this result may in some ways be seen as a replication and extension of two previous studies. In preschoolers, Mussolin et al. (2014) showed that symbolic comparison predicted growth in nonsymbolic comparison but not the other way around. Matejko and Ansari (2016) recently showed a similar result in first graders. It is worth noting that the current sample included roughly an order of magnitude more participants (539 vs. 57 in Mussolin et al., 2014, and 30 in Matejko & Ansari, 2016). Moreover, we also controlled for children's ability to map between the two formats (mixed-comparison performance), and we tested several related hypotheses concerning the ability to map between formats as well. Furthermore, in kindergarteners, Sasanguie et al. (2014) showed no significant correlation between nonsymbolic comparison performance and symbolic comparison performance 6 months later (the opposite relation could not be assessed, because the authors did not collect symbolic comparison scores at the first time point). Crucially, these studies all clearly converge on the conclusion that it is early symbolic numerical skills (part of the SNS) that facilitate improvement in early nonsymbolic magnitude skills (part of the AMS) and not the other way around. This conclusion is further strengthened by the results testing Hypotheses 1-3 (discussed previously) indicating that one's ability to translate between the SNS and the AMS is both nontrivial and likely facilitated by the fact that approximate magnitudes come to be understood primarily in terms of their symbolic counterparts.

Potential Mechanisms
In the previous section, we reviewed evidence (from both the current study and previous work) that supports the notion that the SNS emerges independently from the AMS and that it is the former that primarily predicts growth in the latter (and not the other way around). Although the evidence for the SNS → AMS view outweighs that of the AMS → SNS view, the mechanism(s) by which the SNS may influence the AMS remain largely underspecified. In this section we outline two potential mechanisms.
One possibility suggested by Piazza et al. (2013) as well as Mussolin et al. (2016) is that the SNS directly shapes the AMS by improving the representational precision of the latter. Symbols, which tend to be more precise than are their analog counterparts, can help to "sharpen" the nonsymbolic analog magnitude (AMS) representations. These sharper representations then permit more precise discrimination between nonsymbolic quantities, which in turn predicts improved performance on something like a dot-comparison task. On the other hand, Piazza et al. and Mussolin et al. still posited a direct linkage between symbolic and nonsymbolic representations (that is, they are underpinned by the same kind of representation), and indeed the authors suggested a bidirectional influence between the SNS and the AMS. The unidirectionality of longitudinal results here (in particular the lack of evidence supporting the AMS → SNS direction; see Figure 3) speaks against this assumption, as do our results showing a cost of mixing formats even at the outset of kindergarten (see Figure 2). That said, our results cannot speak directly to children younger than kindergarteners, for whom some sort of link may have been established.
An alternative perspective is to assume that the SNS and the AMS are completely distinct systems from the outset of development. This would be more consistent with the mixing costs seen here (see Figure 2), the work reviewed earlier by Mundy and Gilmore (2009) and Brankaer et al. (2014), and a growing body of neuroimaging studies showing a distinction between symbolic and nonsymbolic numerical representations, at least in adults (Bulthé et al., 2014, 2015; Damarla et al., 2016; Damarla & Just, 2013; Lyons, Ansari, & Beilock, 2015). According to this view, interrelations among the comparison tasks are due primarily to near transfer of learning to do the task. The large directional asymmetry is thus simply a function of having more room for growth in the purely symbolic (NC) case. With formal education, children would be expected to improve rapidly on the symbolic comparison task, which can in some ways be seen as having the highest "ceiling" because symbols can be infinitely precise and allow for a rich set of conceptual associations. Indeed, the steepest growth curve in the current data set was seen for the symbolic (NC) task (see Figure 2). According to this view, children with strong symbolic skills are expected to improve most dramatically on the NC task (consistent with the large positive correlation between Time 1 and Time 2 NC), and it is this improvement in learning to perform one type of numerical comparison task that transfers to the others. Unlike the interpretation outlined in the previous paragraph, this view does not require any direct relation between the SNS and the AMS, because mapping between symbolic and analog representations is not required for the SNS to develop.
Instead, the SNS develops independently from the AMS; part of this development permits improved symbolic comparison performance, which transfers at the task level to the mixed- and dot-comparison tasks (for a similar suggestion regarding how acquisition of number-word meanings might influence the AMS, see Shusterman et al., 2016).
Castles and Coltheart (2004) made a related argument in the discussion of the literature on the predictors of early reading skills. This argument rests primarily on the notion that a given task does not equal a given process (or representation). Showing that performance on a symbolic number comparison task predicts improvement on a nonsymbolic comparison task does not necessarily imply that SNS representations are fundamentally shaping AMS representations; instead, it may mean that learning how to perform the symbolic task changes how one does the nonsymbolic version of the task. In addressing the surprising finding that reading ability predicts improvement in scores on a phonological awareness task, Castles and Coltheart wrote, "The acquisition of reading skills does not actually change the level or nature of phonological awareness itself. Rather, it influences the way in which children perform phonological awareness tasks" (p. 80). In the current case, learning to process numbers symbolically may change how one does tasks that involve nonsymbolic magnitudes as stimuli. That said, precisely how this change occurs and what it may or may not imply for approximate magnitude representation remains a topic for further research.
Nevertheless, we do not believe our results are merely a fluke of (that is, limited to) number comparison tasks, because related results can be found elsewhere in the numerical literature. For instance, children who did not yet have the cardinality principle were completely at chance on a nonsymbolic comparison task (Negen & Sarnecka, 2015). The notion here is that only when children have an understanding of the symbolic cardinal label of a set do they understand how to perform a nonsymbolic comparison task. In addition, Mix (2008) tested 3-year-olds' ability to match sets of objects in terms of their cardinality (is one set of objects numerically equal to another set of objects of a different type). Mix found that children with knowledge of the correct symbolic (verbal) cardinal labels of the sets involved (e.g., "three," "four") performed significantly better on a range of nonsymbolic matching tasks. This suggests that children's ability to compare nonsymbolic quantities is linked to their understanding of the cardinal meaning of number words, which are arguably the first number symbols most children learn (for a review, see, e.g., Mix et al., 2002). Similarly, Shusterman and colleagues (2016) found longitudinal evidence that children's acquisition of cardinal understanding of number words preceded an abrupt improvement in performance on AMS comparison acuity (see also Shusterman et al., 2016, for a detailed discussion of the potential mechanisms by which verbal symbols, i.e., number words, might impact AMS processing). In sum, these results dovetail with those of the current study to suggest that knowledge of and proficiency with the symbolic number system impacts how one processes approximate, analogue magnitudes.
Whether acquisition of a symbolic number system directly impacts approximate magnitude representation or the broader framework (that is, system) by which these magnitudes are (or are not) brought to bear in solving specific tasks remains an interesting avenue for future research.
A point of concern is that processing of nonsymbolic quantities in the subitizing range (1-4) differs from processing of approximate magnitudes outside this range (e.g., Revkin, Piazza, Izard, Cohen, & Dehaene, 2008). In addition, some researchers have suggested that if there is a critical link between symbolic and nonsymbolic representations of quantity, this link is primarily limited to quantities within the subitizing range (e.g., Carey, 2011). This has two implications for the current work. On the one hand, our measure of the AMS may have been "contaminated" by inclusion of trials that use numbers in the subitizing range. In particular, the result that numeral comparison significantly predicted growth in dot comparison may have been inflated by the inclusion of subitizable quantities (a similar argument could be made for the bidirectional relation between numeral and mixed comparisons). On the other hand, the fact that dot comparison did not predict growth in either of the other comparison tasks perhaps speaks against this account. Regardless, we felt it important to test whether the regression results in Figure 3 (and hence our conclusions with respect to Hypotheses 3 and 4) would differ substantially if we limited analyses to only trials comprising stimuli outside the subitizing range (the same limits were imposed on predictors from all three tasks so as not to bias reliability due to the number of trials and to ensure that any relations would be exclusive to quantities outside the subitizing range across the board; results are reported in Appendix B). Results show that numeral comparison remained a strong predictor of growth in the other two tasks and that neither dot nor mixed comparison was a significant predictor of either one another or numeral comparison. This suggests that the influence of the SNS on the AMS extends beyond the subitizing range and that the asymmetry in the influence between the two systems may be even stronger for larger numbers.
Consistent with the earlier discussion (and with Shusterman et al., 2016), the relation between the SNS and the AMS is more likely to operate via more general aspects of the respective systems, though further work is certainly needed.

Reconciling Current Results With Previous Work
In the introduction, we indicated that considerable attention has been paid to the view that the AMS plays a crucial role as the foundation for and scaffold of the SNS (Dehaene, 1997, 2008; Feigenson et al., 2004, 2013; Gallistel & Gelman, 2000; Piazza, 2010). Our results are inconsistent with this view; hence, in this section, we briefly examine how our results may potentially be reconciled with previous evidence that has been interpreted in favor of the AMS → SNS view. First, numerous studies have shown that the precision of an individual's AMS is predictive of SNS abilities (for a review and meta-analysis, see Chen & Li, 2014). Nearly all of these studies are cross-sectional in nature, so causal direction cannot be inferred. Moreover, of studies that have examined the relation between AMS and SNS tasks at different time points, few if any include control for the outcome at Time 1 (e.g., Libertus, Feigenson, & Halberda, 2011; Mazzocco, Feigenson, & Halberda, 2011; Wang, Odic, Halberda, & Feigenson, 2016). Inclusion of this control is crucial because it allows one to genuinely predict growth in the outcome. For instance, if one uses performance on a dot-comparison task at Time 1 to predict performance on an arithmetic task at Time 2, the direction of influence is still ambiguous. However, if one controls for arithmetic performance at Time 1, then the remaining variance in arithmetic at Time 2 accounted for by dot comparison at Time 1 can be attributed to only what has changed in the arithmetic task between Times 1 and 2; hence, one can infer the direction of influence. We included this crucial control here, as did Mussolin et al. (2014) and Matejko and Ansari (2016). In all three cases, the direction of influence was found to be from symbolic to nonsymbolic tasks and not the other way around.
Another line of evidence where causality can be more readily inferred comes from training studies showing that nonsymbolic arithmetic training predicts growth in symbolic arithmetic scores (Hyde et al., 2014; Park & Brannon, 2013) and that intensive math training does not seem to improve AMS precision (Sullivan, Frank, & Barner, 2016). This certainly suggests that there are instances where nonsymbolic numerical processing can influence SNS processing, though the results mentioned earlier do not appear to extend to training on nonsymbolic comparison (Park & Brannon, 2014). Regardless, it is certainly possible that approximate arithmetic performance at the outset of kindergarten might predict growth in symbolic number skills. Furthermore, it is important to emphasize that our data did not allow us to examine whether dot comparison performance predicts growth in tasks that tap the AMS more directly than do symbolic or mixed comparison, such as nonsymbolic arithmetic.
Finally, it is worth pointing out that in most cases where tasks measuring the AMS (e.g., dot comparison) show a zero-order correlation with symbolic tasks (e.g., symbolic arithmetic), this relation is completely accounted for by a comparable symbolic task (e.g., numeral comparison; for empirical results, see, e.g., Göbel, Watson, Lervåg, & Hulme, 2014; Lyons, Price, Vaessen, Blomert, & Ansari, 2014; for a recent meta-analysis and review, see, respectively, Schneider et al., 2017, and Merkley & ...). In other words, symbolic numerical tasks tend to be more strongly related to one another than to nonsymbolic numerical tasks. This is broadly consistent with a strong distinction between the SNS and the AMS, which is also consistent with the results here. Moreover, the longitudinal results in the current study suggest that what relation does exist between symbolic and nonsymbolic numerical tasks primarily reflects the SNS bearing influence on the AMS and not the other way around.

Conclusion
In sum, we tested several hypotheses concerning the nature and direction of the relation between approximate, nonverbal (AMS) and exact, symbolic (SNS) representations of number. Results converged to show that the two systems are relatively distinct even at the outset of kindergarten. Results also clearly indicated that the SNS unidirectionally predicts growth in the AMS. This asymmetry (in particular, the lack of evidence for an AMS → SNS relation) is in direct contrast to the view that the more evolutionarily ancient AMS underpins the culturally acquired SNS. Instead, it appears that culturally acquired number symbols may influence how kindergarteners process nonsymbolic quantities. The precise mechanism by which this process occurs remains unknown. Number symbols may directly change nonsymbolic representations of magnitude, or number symbols may simply afford greater opportunity for improvement in quantity comparison tasks more generally, which transfers to nonsymbolic versions of the task as well. Future work may help elucidate the precise mechanisms. Regardless, the current results may serve to reorient theories about the progression of numerical development, especially in the early stages of formal education.

Appendix A Full Longitudinal Regression Results
Appendix A gives full regression results for the three regression models used to test the longitudinal hypotheses. Note that these models were used to produce the partial r values (and corresponding p values) depicted in Figure 3.

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Appendix B Full Regression Results (Large Items Only)
Here we test whether inclusion of trials involving smaller quantities may have inflated the predictive capacity of the mixed and numeral comparison tasks and may have deflated the predictive capacity of the dot comparison task. We tested this by including only trials where all quantities were outside the subitizing range (> 5) for all three critical predictors (the three comparison tasks at Time 1). Results in the tables show that it did not. Tables are organized in the same manner as in Appendix A for easy comparison. Slight variations in degrees of freedom are due to a few children not having attempted any qualifying trials.


Appendix C Correcting Longitudinal Regression Results for Variation in Task Reliability
Here we report the critical partial correlations shown in Figure 3 but recomputed after disattenuating relations between the critical variables with respect to their relative reliabilities (see Figure C1). In the main text, we ran three regression analyses, one with each of the three comparison tasks at Time 2 as the dependent variable. The three comparison tasks at Time 1 were always the critical independent variables. We computed the partial correlation matrix between the four critical variables (for example, Time 1 numeral comparison, Time 1 dot comparison, Time 1 mixed comparison, Time 2 numeral comparison) after removing the influence of all covariates. Note that at this stage, the correlations between critical variables were residualized only with respect to the covariates and not yet with respect to one another. These correlation matrices were then disattenuated: $r'_{xy} = r_{xy} / \sqrt{x_r \times y_r}$, where $r_{xy}$ is the correlation between variables x and y, and $x_r$ and $y_r$ are the reliability estimates for x and y, respectively (e.g., Murphy & Davidshofer, 2004). Reliability values were taken from the main text (see the Reliability section). Because reliability was less than perfect (< 1) for all tasks, this meant that correlations would be expected to increase in all cases; crucially, however, those involving variables with lower reliability (e.g., between DC and MC) would be expected to increase more than those involving higher reliability. To residualize the critical variables with respect to one another (i.e., to compute unique contributions of each task at Time 1 to growth in the dependent variable, as is depicted in Figure 3), we "reduced" the correlation matrix using a pseudo-inverse procedure that removes all mutual influence (resulting in a matrix of partial correlations akin to what one would get from a multiple regression; Opgen-Rhein & Strimmer, 2007).
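The two steps described in this appendix (disattenuation, then reduction to partial correlations) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis code used in the study: the function names, the example correlation matrix, and the reliability values are all hypothetical, and the partial correlations are obtained by standardizing the (pseudo-)inverse of the correlation matrix, which is one common way to implement such a reduction.

```python
import numpy as np

def disattenuate(corr, reliabilities):
    """Divide each r_xy by sqrt(reliability_x * reliability_y); keep the diagonal at 1."""
    corr = np.asarray(corr, dtype=float)
    rel = np.asarray(reliabilities, dtype=float)
    out = corr / np.sqrt(np.outer(rel, rel))
    np.fill_diagonal(out, 1.0)
    return out

def partial_correlations(corr):
    """Partial correlation of each pair given all other variables.

    Uses the standardized, sign-flipped pseudo-inverse of the correlation matrix.
    """
    precision = np.linalg.pinv(np.asarray(corr, dtype=float))
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial

# Hypothetical covariate-residualized correlations among three variables
# and hypothetical reliabilities (NOT values from the study).
observed = np.array([
    [1.0, 0.5, 0.4],
    [0.5, 1.0, 0.3],
    [0.4, 0.3, 1.0],
])
reliability = np.array([0.9, 0.8, 0.7])

corrected = disattenuate(observed, reliability)
unique = partial_correlations(corrected)

print(np.round(corrected, 3))
print(np.round(unique, 3))
```

Because every reliability is below 1, every off-diagonal entry of `corrected` is larger in magnitude than its counterpart in `observed`, and pairs involving less reliable variables increase the most. Note that disattenuated correlations are not guaranteed to stay within [-1, 1] when reliabilities are low, so the corrected matrix should be inspected before it is inverted.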