NB: These data were removed from the final version of the manuscript but are included here for transparency.
The number of PIK3CA-H1047R copies in The Cancer Genome Atlas (TCGA) breast invasive carcinoma (BRCA) samples was retrieved by filtering previously published copy number information for this dataset (Madsen et al., 2019). Samples were classified as having multiple PIK3CA-H1047R copies (“multiple”) when the mutant allele dosage (mut.multi) exceeded 1.5. Samples with a single PIK3CA-H1047R copy and an additional PIK3CA variant were also classified as “multiple”. The associated RNA sequencing (RNAseq) and clinical data were retrieved using the R package TCGAbiolinks (version 2.12.3) according to the accompanying vignette (Colaprico et al., 2016). Raw sequencing counts were converted to counts per million (cpm) using the *edgeR* package in R. A gene was considered expressed only if it exceeded 1 cpm in more than one tumor sample; all other genes were filtered out. Copy number data, RNAseq data and clinical information were merged on the TCGA sample barcode and then restricted to the genes of interest. Candidate gene expression was plotted either as a function of the number of PIK3CA-H1047R alleles or with additional stratification by tumor stage.
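The classification rule and the expression filter described above can be illustrated with a minimal sketch. It is written in plain Python for illustration only; the original analysis was performed in R with *edgeR*, and this is not that code. The function names, the “single” label for the remaining samples, and the toy counts are hypothetical. The cpm calculation assumes library size is the column sum of raw counts with no additional normalization factors, which the text does not specify.

```python
def classify_h1047r(mut_multi, has_additional_variant, threshold=1.5):
    """Label a PIK3CA-H1047R-positive sample as "multiple" or "single".

    mut_multi: mutant allele dosage for H1047R.
    has_additional_variant: True if the sample carries another PIK3CA variant.
    The "single" label is an assumption; the text only names "multiple".
    """
    if mut_multi > threshold or has_additional_variant:
        return "multiple"
    return "single"


def counts_to_cpm(counts):
    """Convert a genes x samples table of raw counts to counts per million.

    Assumes library size = column sum (no normalization factors) and that
    every sample has a non-zero total count.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [row[j] / lib_sizes[j] * 1e6 for j in range(n_samples)]
        for row in counts
    ]


def expressed_mask(cpm, min_cpm=1.0, min_samples=2):
    """Flag genes with > min_cpm in at least min_samples samples.

    "More than one tumor sample" is read here as at least two samples.
    """
    return [sum(value > min_cpm for value in row) >= min_samples for row in cpm]


# Toy data: three genes (rows) by three samples (columns).
# Each column sums to one million, so cpm is numerically equal to the counts.
toy_counts = [
    [10, 20, 0],                # above 1 cpm in two samples
    [0, 5, 0],                  # above 1 cpm in one sample only
    [999990, 999975, 1000000],  # above 1 cpm in all samples
]
toy_cpm = counts_to_cpm(toy_counts)
toy_expressed = expressed_mask(toy_cpm)
```

In this toy table only the first and third genes would pass the filter, because the second exceeds 1 cpm in a single sample.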