**Abstract:** Recognising other people relies on differentiating between individuals (“telling apart”), as well as generalising across within-person variability (“telling together”; Burton et al., 2013; Lavan et al., 2019a, 2019b). However, brain areas associated with face and voice recognition have often been identified using tightly controlled experiments, which under-represent the naturalistic variability that occurs during authentic person identification. We used representational similarity analysis to uncover neural representations of identity following naturalistic, task-free stimulation. Analyses were conducted on open-access MRI datasets in which participants watched feature-length movies (Aliko et al., 2020). Identity representations, defined as similar response patterns to variable instances of the same person and dissimilar patterns in response to different people, were observed in established face and voice processing areas. We replicated the findings across two independent participant groups in response to different sets of identities. Finally, we explored the contributions of face versus voice information to identity representations, finding that preferential sensitivity to faces was more widespread. Our paradigm thus characterised how the brain represents identities in the real world, for the first time accounting for both “telling people together” and “telling people apart”. The findings complement previous work by showing that similar areas are engaged under task-based and naturalistic exposure.
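
To make the definition of an identity representation concrete, the following is a minimal sketch in Python (NumPy) run on synthetic data. It is not the analysis pipeline used in this study: the function name `identity_index`, the choice of correlation distance, and the simulated response patterns are all illustrative assumptions. The sketch builds a representational dissimilarity matrix across instances and contrasts between-identity with within-identity dissimilarity.

```python
import numpy as np


def identity_index(patterns, labels):
    """Mean between-identity minus mean within-identity correlation distance.

    patterns: array of shape (n_instances, n_voxels), one response pattern per row.
    labels:   array of shape (n_instances,), the identity shown in each instance.
    A positive value means patterns are more alike for the same identity
    ("telling together") than for different identities ("telling apart").
    """
    patterns = np.asarray(patterns, dtype=float)
    labels = np.asarray(labels)
    # Representational dissimilarity matrix: 1 - Pearson correlation between rows.
    rdm = 1.0 - np.corrcoef(patterns)
    # Take each unordered pair of instances once (upper triangle, no diagonal).
    i, j = np.triu_indices(len(labels), k=1)
    same_identity = labels[i] == labels[j]
    within = rdm[i, j][same_identity].mean()
    between = rdm[i, j][~same_identity].mean()
    return between - within


# Synthetic example (assumption): two hypothetical identities, four variable
# instances each, simulated as a shared prototype pattern plus noise.
rng = np.random.default_rng(0)
n_voxels = 50
prototypes = rng.normal(size=(2, n_voxels))
labels = np.repeat([0, 1], 4)
patterns = prototypes[labels] + 0.5 * rng.normal(size=(len(labels), n_voxels))

index = identity_index(patterns, labels)
print(f"identity index: {index:.3f}")
```

In this toy setting a positive index indicates an identity representation in the sense defined above. A real analysis would additionally require response patterns estimated from the imaging data and an appropriate statistical test, neither of which is shown here.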