This repository contains CHiCAGO input, design and output files, as well as chromatin feature files, for the Promoter Capture Hi-C experiments in GM12878 (Mifsud et al., 2015) and mESC (Schoenfelder et al., 2015) cells that were used in the study presenting the CHiCAGO pipeline (Cairns / Freire-Pritchett et al., 2016).

**CHiCAGO input files**

- GM12878_chinput.tar.gz (three biological replicates from Mifsud et al., Nature Genet 2015)
- mESC_chinput.tar.gz (two biological replicates from Schoenfelder et al., Genome Res 2015)

These files are based on the Promoter Capture Hi-C sequencing reads from the above publications (ArrayExpress accessions E-MTAB-2323 and E-MTAB-2414, respectively). The raw sequencing reads were processed with the publicly available HiCUP pipeline (Wingett et al., F1000Research 2016), which mapped the read pairs against the mouse (mm9) and human (hg19) genomes, filtered out experimental artefacts (such as circularized reads and re-ligations) and removed duplicate reads. The resulting BAM files were processed into CHiCAGO input files, retaining only those read pairs that mapped, on at least one end, to a captured bait. The script used for this purpose is available as part of chicagoTools (available at [][1]).

**Capture design files**

- human_hg19_HindIII_design.tar.gz
- mouse_mm9_HindIII_design.tar.gz

These archives expand into directories that can be provided as "designDir" to CHiCAGO. They contain the HindIII restriction digests for human hg19 and mouse mm9, respectively (.rmap files), the lists of the restriction fragments baited in the original experiments (.baitmap), and three types of auxiliary files (.npb, .nbpb and .poe) that were generated from the respective .rmap and .baitmap files using a script available as part of chicagoTools.

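The bait-retention rule described above (keep a read pair only if at least one end maps to a captured bait) can be illustrated with a minimal Python sketch. This is not the chicagoTools script. The .baitmap column layout used here (tab-separated chromosome, start, end, fragment ID, annotation) is an assumption that this page does not state, and the function names and toy data are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def load_bait_ids(baitmap_lines: Iterable[str]) -> Set[int]:
    """Collect baited fragment IDs from .baitmap-style lines.

    Assumed layout (not confirmed by this page): tab-separated
    chromosome, start, end, fragment ID, annotation.
    """
    bait_ids: Set[int] = set()
    for line in baitmap_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        bait_ids.add(int(fields[3]))  # fourth column: fragment ID (assumed)
    return bait_ids


def keep_baited_pairs(
    pairs: Iterable[Tuple[int, int]], bait_ids: Set[int]
) -> List[Tuple[int, int]]:
    """Keep read pairs in which at least one end falls on a baited fragment."""
    return [(a, b) for a, b in pairs if a in bait_ids or b in bait_ids]


# Toy data: two baited fragments (IDs 2 and 4) and four read pairs,
# each given as the fragment IDs of its two ends.
baitmap = [
    "chr1\t1000\t5000\t2\tGENE_A",
    "chr1\t9000\t12000\t4\tGENE_B",
]
pairs = [(1, 2), (3, 5), (4, 4), (5, 4)]

bait_ids = load_bait_ids(baitmap)
kept = keep_baited_pairs(pairs, bait_ids)
print(kept)  # the pair (3, 5) touches no bait and is dropped
```

In this sketch a pair with both ends on baits, such as (4, 4), is kept as well, since the rule only requires that at least one end is baited.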
**Feature files**

- EncodeDataGM.tar.gz (human)
- MouseEncode.tar.gz (mouse)

These archives expand into directories containing the chromatin features that were used in the peakEnrichment4Features() analysis shown in Figure 6 of Cairns / Freire-Pritchett et al. The feature list file in each directory can be provided to chicagoTools' runChicago.R (as --en-feat-list) or to peakEnrichment4Features() (as list_frag).

**CHiCAGO output files**

- res_GM_merge_final_chicago2.tar.gz (GM12878 cells)
- res_mESC_merge_final_chicago2.tar.gz (mESCs)

These files expand into output directories generated by the runChicago.R wrapper (available as part of chicagoTools) using the input, design and feature files described above and default settings. Each of these directories contains the subdirectories data, diag_plots, enrichment_data and examples, as described below.

*The data subdirectory*:

- data/*.Rds: the full Chicago object containing interaction-level data, input settings and trained parameters.
- data/*_params.txt: the parameters of runChicago.R and chicagoPipeline() used in the run.
- data/*_seqmonk.txt and data/*_washU_text.txt: the lists of significant interactions (CHiCAGO score >= 5) in the formats readable by Seqmonk ([][2]) and the WashU Epigenome Browser ([][3]), respectively. Note that in the latter case the file should be supplied as a custom track using the option "Got text files instead? Upload them from your computer".

*The diag_plots subdirectory* contains diagnostic plots generated by chicagoPipeline().

*The enrichment_data subdirectory* contains the feature enrichment results as a plot and text output generated by peakEnrichment4Features().

*The examples subdirectory* contains a PDF file with the profiles of 25 random baits, generated by plotBaits().

**Further details**

- See the Chicago R package vignette (also available as Additional file 3 in Cairns / Freire-Pritchett et al.) for more details on the output files.
- The CHiCAGO homepage is [][4].
The Chicago R package is also available from Bioconductor (release 3.3+).
- Please contact Mikhail Spivakov with questions regarding these datasets or the CHiCAGO pipeline.

[1]:
[2]:
[3]:
[4]:
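The score cutoff used for the exported interaction lists (CHiCAGO score >= 5) can be illustrated with a minimal Python sketch. The `Interaction` record and its fields are hypothetical and do not reproduce the Seqmonk or WashU text formats. The sketch only shows that the threshold is inclusive, as the ">=" in the description implies.

```python
from typing import List, NamedTuple


class Interaction(NamedTuple):
    """Hypothetical record for one bait/other-end pair (not a CHiCAGO format)."""
    bait_id: int
    other_end_id: int
    score: float


def significant(
    interactions: List[Interaction], cutoff: float = 5.0
) -> List[Interaction]:
    """Keep interactions whose score is at or above the cutoff."""
    return [i for i in interactions if i.score >= cutoff]


# Toy scores chosen to sit above, exactly at, and just below the cutoff.
calls = [
    Interaction(2, 7, 12.3),
    Interaction(2, 9, 5.0),
    Interaction(4, 11, 4.99),
]

kept = significant(calls)
print([(i.bait_id, i.other_end_id) for i in kept])
```

An interaction scoring exactly 5 is retained, while one scoring 4.99 is not.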