Mutations accumulate in the genome of every cell of the body throughout life, causing cancer and other diseases. Most mutations begin as nucleotide mismatches or damage in one of the two strands of the DNA before becoming double-strand mutations if unrepaired or misrepaired. However, current DNA-sequencing technologies cannot accurately resolve these initial single-strand events. Here we develop a single-molecule, long-read sequencing method, Hairpin Duplex Enhanced Fidelity sequencing (HiDEF-seq), that achieves single-molecule fidelity for base substitutions when present in either one or both DNA strands. HiDEF-seq also detects cytosine deamination (a common type of DNA damage) with single-molecule fidelity. We profile 134 samples from diverse tissues, including from individuals with cancer predisposition syndromes, and derive from them single-strand mismatch and damage signatures. We find correspondences between these single-strand signatures and known double-strand mutational signatures, which resolve the identity of the initiating lesions. Tumours deficient in both mismatch repair and replicative polymerase proofreading show distinct single-strand mismatch patterns compared with samples deficient only in polymerase proofreading. We also define a single-strand damage signature for APOBEC3A. In the mitochondrial genome, our findings support a mutagenic mechanism occurring primarily during replication. As double-strand DNA mutations are only the end point of the mutation process, our approach to detect the initiating single-strand events at single-molecule resolution will enable studies of how mutations arise in a variety of contexts, especially in cancer and ageing.