In this paper, we aim to contribute to the debate concerning the integration of speech and gesture. More specifically, our goal is to determine whether the two serve as more or less independent sub-systems of the same processing/production circuit (McNeill, 1992, 2005; Kendon, 2000; and more recently Schlenker, 2018, 2019, 2020; Ebert, 2018), or whether gestures form a system distinct from speech, and are thus related to thoughts rather than directly to words (Abner et al., 2015; Goldin-Meadow & Alibali, 2013; Goldin-Meadow et al., 2001). We present results from two experiments conducted on French speakers, designed to evaluate the processing costs and the relative advantages of bimodal gesture-speech input over speech-only input at the lexical level. Analyses of reaction times (RTs) provide evidence supporting an early integration of gesture and speech in lexical tasks.