proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes
Fullam, A.; Letunic, I.; Schmidt, T.S.B.; Ducarmon, Q.R.; Karcher, N.; Khedkar, S.; Kuhn, M.; Larralde, M.; Maistrenko, O.M.; Malfertheiner, L.; Milanese, A.; Rodrigues, J.F.M.; Sanchis-López, C.; Schudoma, C.; Szklarczyk, D.; Sunagawa, S.; Zeller, G.; Huerta-Cepas, J.; von Mering, C.; Bork, P.; Mende, D.R. (2022). proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes. Nucleic Acids Res. 51(D1): D760-D766. https://dx.doi.org/10.1093/nar/gkac1078
In: Nucleic Acids Research. Information Retrieval: London. ISSN 0305-1048; e-ISSN 1362-4962
Abstract
The interpretation of genomic, transcriptomic and other microbial ‘omics data is highly dependent on the availability of well-annotated genomes. As the number of publicly available microbial genomes continues to increase exponentially, the need for quality control and consistent annotation is becoming critical. We present proGenomes3, a database of 907 388 high-quality genomes containing 4 billion genes that passed stringent criteria and have been consistently annotated using multiple functional and taxonomic databases including mobile genetic elements and biosynthetic gene clusters. proGenomes3 encompasses 41 171 species-level clusters, defined based on universal single copy marker genes, for which pan-genomes and contextual habitat annotations are provided. The database is available at http://progenomes.embl.de/