School of Medicine Publications and Presentations
Rapid, Phase-free Detection of Long Identity-by-Descent Segments Enables Effective Relationship Classification
Document Type
Article
Publication Date
4-2-2020
Abstract
Identity-by-descent (IBD) segments are a useful tool for applications ranging from demographic inference to relationship classification, but most detection methods rely on phasing information and therefore require substantial computation time. As genetic datasets grow, methods for inferring IBD segments that scale well will be critical. We developed IBIS, an IBD detector that locates long regions of allele sharing between unphased individuals, and benchmarked it with Refined IBD, GERMLINE, and TRUFFLE on 3,000 simulated individuals. Phasing these with Beagle 5 takes 4.3 CPU days, followed by either Refined IBD or GERMLINE segment detection in 2.9 or 1.1 h, respectively. By comparison, IBIS finishes in 6.8 min or 7.8 min with IBD2 functionality enabled: speedups of 805–946× including phasing time. TRUFFLE takes 2.6 h, corresponding to IBIS speedups of 20.2–23.3×. IBIS is also accurate, inferring ≥7 cM IBD segments at quality comparable to Refined IBD and GERMLINE. With these segments, IBIS classifies first through third degree relatives in real Mexican American samples at rates meeting or exceeding other methods tested and identifies fourth through sixth degree pairs at rates within 0.0%–2.0% of the top method. While allele frequency-based approaches that do not detect segments can infer relationship degrees faster than IBIS, the fastest are biased in admixed samples, with KING inferring 30.8% fewer fifth degree Mexican American relatives correctly compared with IBIS. Finally, we ran IBIS on chromosome 2 of the UK Biobank dataset and estimate its runtime on the autosomes to be 3.3 days parallelized across 128 cores.
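The speedup figures in the abstract are ratios of total runtimes, and the arithmetic can be checked roughly from the rounded timings quoted above. The sketch below is illustrative only: the variable names and the `speedup` helper are hypothetical, and the inputs are the rounded values from the abstract. With these inputs the ratios come out at about 802–936× against phasing plus segment detection and about 20.0–22.9× against TRUFFLE. That is close to, but not identical to, the reported 805–946× and 20.2–23.3×, presumably because the published figures were computed from unrounded timings (an assumption).

```python
# Rounded timings quoted in the abstract (3,000 simulated individuals).
MIN_PER_HOUR = 60
MIN_PER_DAY = 24 * MIN_PER_HOUR

phasing_min = 4.3 * MIN_PER_DAY  # Beagle 5 phasing: CPU days -> minutes
detect_min = {                   # segment detection after phasing
    "Refined IBD": 2.9 * MIN_PER_HOUR,
    "GERMLINE": 1.1 * MIN_PER_HOUR,
}
truffle_min = 2.6 * MIN_PER_HOUR  # TRUFFLE needs no phasing step
ibis_min = {"without IBD2": 6.8, "with IBD2": 7.8}


def speedup(baseline_min, ibis_runtime_min):
    """Ratio of a baseline runtime to an IBIS runtime (hypothetical helper)."""
    return baseline_min / ibis_runtime_min


# Phase-based pipelines: phasing time plus detection time, versus IBIS alone.
phased = [
    speedup(phasing_min + d, i)
    for d in detect_min.values()
    for i in ibis_min.values()
]
# TRUFFLE is also phase-free, so only its own runtime is compared.
truffle = [speedup(truffle_min, i) for i in ibis_min.values()]

print(f"vs. phasing + detection: {min(phased):.0f}x to {max(phased):.0f}x")
print(f"vs. TRUFFLE: {min(truffle):.1f}x to {max(truffle):.1f}x")
```

Because phasing accounts for nearly all of the baseline runtime (about 6,192 of roughly 6,300 minutes), the choice between Refined IBD and GERMLINE changes the ratio far less than enabling IBD2 in IBIS does.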
Recommended Citation
Seidman, D. N., Shenoy, S. A., Kim, M., Babu, R., Woods, I. G., Dyer, T. D., Lehman, D. M., Curran, J. E., Duggirala, R., Blangero, J., & Williams, A. L. (2020). Rapid, Phase-free Detection of Long Identity-by-Descent Segments Enables Effective Relationship Classification. American Journal of Human Genetics, 106(4), 453–466. https://doi.org/10.1016/j.ajhg.2020.02.012
Publication Title
Am J Hum Genet
DOI
10.1016/j.ajhg.2020.02.012
Academic Level
faculty
Mentor/PI Department
Office of Human Genetics
Comments
Under an Elsevier user license