Document of bibliographic reference 405815

BibliographicReference record

Type
Bibliographic resource
Type of document
Journal article
BibLvlCode
AS
Title
dataFishing: An efficient Python tool and user-friendly web-form for mining mitochondrial and chloroplast sequences, taxonomic, and biodiversity data
Abstract
NCBI GenBank and BOLD Systems are important databases for biodiversity research, in which the deposited data can be used for various purposes, such as species identification analysis, evolutionary studies, biodiversity monitoring, and assessing the effects of potential climate change on species distributions. Other information, such as taxonomy, collection site locations, and conservation status, is often critical for these studies. Some databases, such as GBIF, BOLD Systems, and GenBank, provide data on the taxonomy, habitat, and geographic distribution of various taxonomic groups, while others, such as WoRMS and IUCN, have specific data on marine species and conservation status. However, depending on the taxonomic group studied, searches in these databases can encompass dozens or hundreds of queries, forcing researchers to conduct extensive searches in each database, which is a time-consuming and error-prone process. To facilitate and automate access to this information, we introduce dataFishing, a Python script and a web form. dataFishing is faster and more efficient than R packages such as bold, taxize, rgbif, rredlist, and worrms for obtaining taxonomic information from the consulted databases. Moreover, it allows the retrieval of DNA sequences, common names, synonyms, conservation status, and species occurrence points. The tool is free and enables more systematic, time-efficient searches across these databases.
Bibliographic citation
Rabelo, L.; Sodré, D.; Balcázar, O.D.A.; do Rosário, M.F.; Guimarães-Costa, A.J.; Gomes, G.; Sampaio, I.; Vallinoto, M. (2025). dataFishing: An efficient Python tool and user-friendly web-form for mining mitochondrial and chloroplast sequences, taxonomic, and biodiversity data. Ecological Informatics 85: 102970. https://dx.doi.org/10.1016/j.ecoinf.2024.102970
Is peer reviewed
true
Access rights
open access
Is accessible for free
true

Authors

author
Name
Luan Rabelo
author
Name
Davidson Sodré
author
Name
Oscar David Albito Balcázar
author
Name
Murilo Furtado do Rosário
author
Name
Aurycéia Jaquelyne Guimarães-Costa
author
Name
Grazielle Gomes
author
Name
Iracilda Sampaio
author
Name
Marcelo Vallinoto

Links

referenced creativework
type
DOI
accessURL
https://dx.doi.org/10.1016/j.ecoinf.2024.102970

thesaurus terms

term
Bioinformatics (term code: 61952 - defined in term set: CSA Technology Research Database Master Thesaurus)
Taxonomy (term code: 8377 - defined in term set: ASFA Thesaurus List)

Document metadata

date created
2025-02-24
date modified
2025-02-24