Abstract
Hi-C data is commonly normalized using single-sample processing methods, with a focus on comparisons between regions within a given contact map. Here, we aim to compare contact maps across different samples. We demonstrate that unwanted variation, of likely technical origin, is present in Hi-C data with replicates from different individuals, and that the properties of this unwanted variation change across the contact map. We present BNBC, a method for normalization and batch correction of Hi-C data, and show that it substantially improves comparisons across samples, including in a QTL analysis as well as in a differential enrichment analysis across cell types.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Extensive revision. 1. An error was found in our ICE implementation, but not in ICE-OE. We have dropped ICE-OE. 2. We have added the analysis of a new dataset, with a differential enrichment analysis between cell types.