Detecting chromosomal interactions in Capture Hi-C data with CHiCAGO and companion tools

Paula Freire-Pritchett, Helen Ray-Jones, Monica Della Rosa, Chris Q. Eijsbouts, William R. Orchard, Steven W. Wingett, Chris Wallace, Jonathan Cairns, Mikhail Spivakov, Valeriya Malysheva

Published: 2021-08-08 DOI: 10.1038/s41596-021-00567-5

Extended Data Fig. 1 Comparative analysis of PCHi-C data generated with a four- and a six-cutter restriction enzyme.

Three MboI PCHi-C replicates obtained from iPSC-derived cardiomyocytes (iPSC CMs; ref. 33) were processed by CHiCAGO either at the restriction fragment level, using standard 4 bp cutter settings, or in 5 kb bins, as described in the Procedure. Three HindIII PCHi-C replicates obtained from hESC-derived cardiomyocytes (hESC CMs; ref. 34) were processed using standard 6 bp cutter settings. Only genes baited in both iPSC CMs and hESC CMs were included in the comparative analysis. An interaction was considered shared when the midpoints of the significantly interacting fragments in the MboI data fell within the respective interacting fragments in the HindIII dataset (CHiCAGO score >5). When several interactions in the MboI data overlapped with the same HindIII interaction, they were counted as a single shared interaction to avoid double-counting. a,b, Comparison between MboI and HindIII PCHi-C datasets in nonbinned mode (a) and binned mode (b). The violin plots show the distance distribution of significant interactions in the shared, MboI-specific and HindIII-specific groups. The number of significant interactions in each group is indicated in gray. The barplots show enrichment for regulatory histone marks (as the ratio of observed to expected) in each group of interactions.
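The shared-interaction rule in this legend can be illustrated with a minimal Python sketch. It is not the authors' code: the `Fragment` and `Interaction` types, the half-open coordinate convention and the function names are assumptions made for illustration, and the inputs are assumed to be already filtered to significant interactions.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (assumed convention)
    end: int    # exclusive (assumed convention)


class Interaction(NamedTuple):
    bait: Fragment
    other_end: Fragment


def midpoint_within(inner: Fragment, outer: Fragment) -> bool:
    """True if the midpoint of `inner` lies inside `outer` on the same chromosome."""
    mid = (inner.start + inner.end) // 2
    return inner.chrom == outer.chrom and outer.start <= mid < outer.end


def count_shared(mboi: List[Interaction], hindiii: List[Interaction]) -> int:
    """Count HindIII interactions matched by at least one MboI interaction.

    Several MboI interactions overlapping the same HindIII interaction
    contribute a single shared interaction, which avoids double-counting.
    """
    shared = set()
    for idx, h in enumerate(hindiii):
        for m in mboi:
            if midpoint_within(m.bait, h.bait) and midpoint_within(m.other_end, h.other_end):
                shared.add(idx)
                break  # one match is enough for this HindIII interaction
    return len(shared)
```

The nested loop is quadratic and is kept only for readability; an interval index would be preferable for genome-scale data.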

Extended Data Fig. 2 QC plots generated by HiCUP for downsampled CHi-C data.

MyLa CHi-C replicate 1 (ref. 36) was downsampled to 20 million raw read pairs and processed using HiCUP (ref. 19), as described in the Procedure. a, Truncation, alignment to GRCh37 and pairing results for read 1 (dark blue) and read 2 (light blue). The ~15 million paired reads are taken forward for filtering. b, Detection of valid Hi-C di-tags (dark blue) and removal of Hi-C artifacts such as religation products (turquoise) and di-tags falling outside the specified size range (orange). c, Size distribution of di-tags, with limits shown as red lines. d, Interacting fragments are grouped into cis <10 kb (dark blue), cis >10 kb (light blue) and trans (green) for di-tags before (left) and after (right) removal of PCR duplicates.

Extended Data Fig. 3 QC plots generated by CHiCAGO for downsampled CHi-C data.

Downsampled CHi-C datasets (ref. 36) were processed by CHiCAGO using both replicates per cell line, as described in the Procedure. a, Barplot showing the scaling factors (s_i) computed for each pool of other ends for MyLa. b, Boxplots showing the distribution of technical noise estimates for each pool of baits/viewpoints (top) and for each pool of other ends (bottom) for MyLa. c, Distance dependency of background counts and computed fit (red curve), plotted on a log–log scale for MyLa. d, Interaction profiles for the bait 670997, assigned to rs4141001, in MyLa (top) and HaCaT (bottom). High-scoring interactions detected by CHiCAGO (score ≥5) are shown in red, and subthreshold interactions (3 ≤ score < 5) are shown in blue. e, Number of overlaps between chromatin features and interacting fragments detected using CHiCAGO (yellow bars) versus number of overlaps from 100 random distance-matched subsets of HindIII fragments (blue bars) in MyLa (top) and HaCaT (bottom). Error bars represent 95% confidence intervals.
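The two score bands used for coloring in panel d can be written as a short Python sketch. The thresholds of 5 and 3 come from the legend; the function name and the "background" label for scores below 3 are assumptions introduced here, not CHiCAGO terminology.

```python
def classify_score(score: float, high: float = 5.0, sub: float = 3.0) -> str:
    """Assign a CHiCAGO score to the band used for coloring in panel d.

    "high" (score >= 5) and "subthreshold" (3 <= score < 5) follow the legend;
    "background" is a hypothetical label for everything below 3.
    """
    if score >= high:
        return "high"
    if score >= sub:
        return "subthreshold"
    return "background"


# Example: scores for a handful of other-end fragments of one bait
bands = [classify_score(s) for s in (7.3, 5.0, 4.9, 3.0, 1.2)]
```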

Extended Data Fig. 4 Identifying differential interactions between conditions using Chicdiff.

a, Dendrogram for downsampled HaCaT and MyLa samples (ref. 36), obtained from running getPeakMatrix as outlined in the Procedure. b, Chicdiff (ref. 45) bait profiles were generated for four loci as described in the Procedure. The plots show the raw read counts versus linear distance from the bait fragment as mirror images for HaCaT and MyLa. Other-end interacting fragments are pooled and color-coded by their adjusted weighted P-value.

Extended Data Fig. 5 Example of fine-mapping chromatin contacts with Peaky.

The full MyLa CHi-C data (ref. 36) were processed by CHiCAGO using both replicates and then analyzed using Peaky (ref. 44). The top panel shows the distribution of raw read counts for other-end fragments for the bait 642001, with high-scoring interactions (CHiCAGO score ≥5) highlighted in blue. The second panel shows the CHiCAGO adjusted read counts, with high-scoring interactions (CHiCAGO score ≥5) highlighted in blue and the Peaky model fit shown as a green line. The third panel shows CHiCAGO scores for those interactions, with the blue dashed line marking the score cutoff of 5. In the bottom panel, the probability of each other-end fragment being a causal contact is quantified as the marginal posterior probability of contact (MPPC). Based on this metric, a number of fragments with CHiCAGO score ≥5 (points highlighted in blue) have MPPC very close to zero. After discounting these, a smaller subset of fine-mapped interactions may be identified.
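The final filtering step described in this legend, discounting high-scoring fragments whose MPPC is close to zero, might look like the Python sketch below. The `OtherEnd` record, the function name and in particular the `mppc_cutoff` value of 0.01 are hypothetical; the legend gives no numeric MPPC threshold, so the cutoff should be chosen by the user.

```python
from typing import List, NamedTuple


class OtherEnd(NamedTuple):
    fragment_id: int
    chicago_score: float
    mppc: float  # marginal posterior probability of contact from Peaky


def fine_mapped(
    other_ends: List[OtherEnd],
    score_cutoff: float = 5.0,   # CHiCAGO score cutoff used in the legend
    mppc_cutoff: float = 0.01,   # hypothetical value, not taken from the source
) -> List[OtherEnd]:
    """Keep fragments that pass the score cutoff and have non-negligible MPPC."""
    return [
        oe for oe in other_ends
        if oe.chicago_score >= score_cutoff and oe.mppc > mppc_cutoff
    ]


ends = [
    OtherEnd(1, 8.2, 0.45),  # high score, clear MPPC: kept
    OtherEnd(2, 6.1, 0.0),   # high score, MPPC at zero: discounted
    OtherEnd(3, 2.0, 0.30),  # below the score cutoff: dropped
]
kept_ids = [oe.fragment_id for oe in fine_mapped(ends)]
```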

References

Schoenfelder, S. & Fraser, P. Long-range enhancer-promoter contacts in gene expression control. Nat. Rev. Genet. 20, 437–455 (2019).
Schmitt, A. D., Hu, M. & Ren, B. Genome-wide mapping and analysis of chromosome architecture. Nat. Rev. Mol. Cell Biol. 17, 743–755 (2016).
Lieberman-Aiden, E. et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293 (2009).
van Berkum, N. L. et al. Hi-C: a method to study the three-dimensional architecture of genomes. J. Vis. Exp. https://doi.org/10.3791/1869 (2010).
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385 (2012).
Sexton, T. et al. Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472 (2012).
Schoenfelder, S. et al. The pluripotent regulatory circuitry connecting promoters to their long-range interacting elements. Genome Res. 25, 582–597 (2015).
Mifsud, B. et al. Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat. Genet. 47, 598–606 (2015).
Sahlén, P. et al. Genome-wide mapping of promoter-anchored interactions with close to single-enhancer resolution. Genome Biol. 16, 156 (2015).
Hughes, J. R. et al. Analysis of hundreds of cis-regulatory landscapes at high resolution in a single, high-throughput experiment. Nat. Genet. 46, 205–212 (2014).
Würtele, H. & Chartrand, P. Genome-wide scanning of HoxB1-associated loci in mouse ES cells using an open-ended chromosome conformation capture methodology. Chromosome Res. 14, 477–495 (2006).
Simonis, M. et al. Nuclear organization of active and inactive chromatin domains uncovered by chromosome conformation capture-on-chip (4C). Nat. Genet. 38, 1348–1354 (2006).
Zhao, Z. et al. Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat. Genet. 38, 1341–1347 (2006).
Cairns, J. et al. CHiCAGO: robust detection of DNA looping interactions in Capture Hi-C data. Genome Biol. 17, 127 (2016).
Huber, W. et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods 12, 115–121 (2015).
Rosa, A., Becker, N. B. & Everaers, R. Looping probabilities in model interphase chromosomes. Biophys. J. 98, 2410–2419 (2010).
Bohn, M. & Heermann, D. W. Diffusion-driven looping provides a consistent framework for chromatin organization. PLoS One 5, e12218 (2010).
Wingett, S. et al. HiCUP: pipeline for mapping and processing Hi-C data. F1000Res. 4, 1310 (2015).
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. B Methodol. 57, 289–300 (1995).
Genovese, C. R., Roeder, K. & Wasserman, L. False discovery control with p-value weighting. Biometrika 93, 509–524 (2006).
Ignatiadis, N., Klaus, B., Zaugg, J. B. & Huber, W. Data-driven hypothesis weighting increases detection power in genome-scale multiple testing. Nat. Methods https://doi.org/10.1038/nmeth.3885 (2016).
Freire-Pritchett, P. et al. Global reorganisation of cis-regulatory units upon lineage commitment of human embryonic stem cells. eLife 6, e21926 (2017).
Novo, C. L. et al. Long-range enhancer interactions are prevalent in mouse embryonic stem cells and are reorganized upon pluripotent state transition. Cell Rep. 22, 2615–2627 (2018).
Chovanec, P. et al. Widespread reorganisation of pluripotent factor binding and gene regulatory interactions between human pluripotent states. Nat. Commun. 12, 2098 (2021).
Siersbæk, R. et al. Dynamic rewiring of promoter-anchored chromatin loops during adipocyte differentiation. Mol. Cell 66, 420–435.e5 (2017).
Rubin, A. J. et al. Lineage-specific dynamic and pre-established enhancer-promoter contacts cooperate in terminal differentiation. Nat. Genet. 49, 1522–1528 (2017).
Thiecke, M. J. et al. Cohesin-dependent and -independent mechanisms mediate chromosomal contacts between promoters and enhancers. Cell Rep. 32, 107929 (2020).
Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384.e19 (2016).
Burren, O. S. et al. Chromosome contacts in activated T cells identify autoimmune disease candidate genes. Genome Biol. 18, 165 (2017).
Petersen, R. et al. Platelet function is modified by common sequence variation in megakaryocyte super enhancers. Nat. Commun. 8, 16058 (2017).
Litchfield, K. et al. Identification of 19 new risk loci and potential regulatory mechanisms influencing susceptibility to testicular germ cell tumor. Nat. Genet. 49, 1133–1140 (2017).
Montefiori, L. E. et al. A promoter interaction map for cardiovascular disease genetics. eLife 7, e35788 (2018).
Choy, M. K. et al. Promoter interactome of human embryonic stem cell-derived cardiomyocytes connects GWAS regions to cardiac gene networks. Nat. Commun. 9, 2526 (2018).
Joshi, O. et al. Dynamic reorganization of extremely long-range promoter-promoter interactions between two states of pluripotency. Cell Stem Cell 17, 748–757 (2015).
Ray-Jones, H. et al. Mapping DNA interaction landscapes in psoriasis susceptibility loci highlights KLF4 as a target gene in 9q31. BMC Biol. 18, 47 (2020).
Martin, P. et al. Chromatin interactions reveal novel gene targets for drug repositioning in rheumatic diseases. Ann. Rheum. Dis. 78, 1127–1134 (2019).
Ghavi-Helm, Y. et al. Highly rearranged chromosomes reveal uncoupling between genome topology and gene expression. Nat. Genet. 51, 1272–1282 (2019).
Andrey, G. et al. Characterization of hundreds of regulatory landscapes in developing limbs reveals two regimes of chromatin folding. Genome Res. 27, 223–233 (2017).
Su, C. et al. Mapping effector genes at lupus GWAS loci using promoter Capture-C in follicular helper T cells. Nat. Commun. 11, 3294 (2020).
Chesi, A. et al. Genome-scale Capture C promoter interactions implicate effector genes at GWAS loci for bone mineral density. Nat. Commun. 10, 1260 (2019).
Anil, A., Spalinskas, R., Åkerborg, Ö. & Sahlén, P. HiCapTools: a software suite for probe design and proximity detection for targeted chromosome conformation capture applications. Bioinformatics 34, 675–677 (2018).
Ben Zouari, Y., Molitor, A. M., Sikorska, N., Pancaldi, V. & Sexton, T. ChiCMaxima: a robust and simple pipeline for detection and visualization of chromatin looping in Capture Hi-C. Genome Biol. 20, 102 (2019).
Eijsbouts, C. Q., Burren, O. S., Newcombe, P. J. & Wallace, C. Fine mapping chromatin contacts in capture Hi-C data. BMC Genomics 20, 77 (2019).
Cairns, J., Orchard, W. R., Malysheva, V. & Spivakov, M. Chicdiff: a computational pipeline for detecting differential chromosomal interactions in Capture Hi-C data. Bioinformatics 35, 4764–4766 (2019).
Holgersen, E. M. et al. Identifying high-confidence capture Hi-C interactions using CHiCANE. Nat. Protoc. 16, 2257–2285 (2021).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Thiecke, M. J. et al. Cohesin-dependent and -independent mechanisms mediate chromosomal contacts between promoters and enhancers. Cell Rep. 32, 107929 (2020).
Ay, F., Bailey, T. L. & Noble, W. S. Statistical confidence estimation for Hi-C data reveals regulatory chromatin contacts. Genome Res. 24, 999–1011 (2014).
Heinz, S. et al. Transcription elongation can affect genome 3D structure. Cell 174, 1522–1536.e22 (2018).
Imakaev, M. et al. Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003 (2012).
Beccari, L. et al. Dbx2 regulation in limbs suggests inter-TAD sharing of enhancers. Dev. Dyn. https://doi.org/10.1002/dvdy.303 (2021).
Su, C., Pahl, M. C., Grant, S. F. A. & Wells, A. D. Restriction enzyme selection dictates detection range sensitivity in chromatin conformation capture-based variant-to-gene mapping approaches. Preprint at bioRxiv https://doi.org/10.1101/2020.12.15.422932 (2020).
Disney-Hogg, L., Kinnersley, B. & Houlston, R. Algorithmic considerations when analysing capture Hi-C data. Wellcome Open Res. 5, 289 (2020).
Feldmann, A., Dimitrova, E., Kenney, A., Lastuvkova, A. & Klose, R. J. CDK-Mediator and FBXL19 prime developmental genes for activation by promoting atypical regulatory interactions. Nucleic Acids Res. 48, 2942–2955 (2020).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Zhou, X. et al. The Human Epigenome Browser at Washington University. Nat. Methods 8, 989–990 (2011).
Zhou, X. et al. Exploring long-range genome interactions using the WashU Epigenome Browser. Nat. Methods 10, 375–376 (2013).
