The design and construction of reference pangenome graphs

13 Mar 2020  ·  Heng Li, Xiaowen Feng, Chong Chu ·

The recent advances in sequencing technologies enables the assembly of individual genomes to the reference quality. How to integrate multiple genomes from the same species and to make the integrated representation accessible to biologists remain an open challenge. Here we propose a graph-based data model and associated formats to represent multiple genomes while preserving the coordinate of the linear reference genome. We implemented our ideas in the minigraph toolkit and demonstrate that we can efficiently construct a pangenome graph and compactly encode tens of thousands of structural variants missing from the current reference genome.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here