Although the reference human genome sequence has served as the foundation for genetic and biomedical research and applications over the last two decades, there is a broad consensus that no single reference sequence can represent the genomic diversity of global populations.
On the one hand, high-quality population-specific and haplotype-resolved genome references are necessary for genetic and medical analysis. On the other hand, there is a clear need to shift from a single reference to a pangenome form that better represents genomic diversity, that is, allelic variation within and across human populations.
Here, we present the first effort (Phase I) of the Chinese Pangenome Consortium (CPC): the draft CPC pangenome reference, based on 116 high-quality haplotype-resolved assemblies from 58 core samples representing 36 minority Chinese ethnic groups and 6 assemblies of the Han Chinese.
- The pangenome references built from the CPC core samples, and those built from the CPC core samples combined with the HPRC samples, are freely available from the POG website.
- Assemblies of 57 samples using only HiFi reads (including low-quality assemblies of 10 non-core samples).
- Assemblies of another 11 samples using HiFi reads and paired-end Hi-C reads.
The above files are available as described in the "Data availability" section of the paper "A pangenome reference of 36 Chinese populations".
The processing workflow and its details are described in the protocol.
Gao, Y., Yang, X., Chen, H. et al. A pangenome reference of 36 Chinese populations. Nature (2023). https://doi.org/10.1038/s41586-023-06173-7
- Correspondence and requests for materials should be addressed to S.X. (Email: [email protected]).
- For technical questions and bug reports, please contact Y.G. (Email: [email protected]).
Chinese Pangenome Consortium | Human Population Omics Group