{"refrec":{"BRefID":253418,"RR":"<b>Zermoglio, P.F.; Guralnick, R.P.; Wieczorek, J.R.</b> (2016). A standardized Reference Data Set for Vertebrate Taxon Name Resolution. <i>PLoS One 11(1)</i>: e0146894. <a href=\"http://dx.doi.org/10.1371/journal.pone.0146894\" target=\"_blank\">http://dx.doi.org/10.1371/journal.pone.0146894</a>","BEntID":245231,"PublicFlag":1,"CheckedFlag":1,"wosflag":1,"vabbflag":1,"RefStringPartII":". <i>PLoS One 11(1)</i>: e0146894. <a href=\"http://dx.doi.org/10.1371/journal.pone.0146894\" target=\"_blank\">http://dx.doi.org/10.1371/journal.pone.0146894</a>","DocTypID":8,"DocType":"Journal article","MarineFlag":0,"FreshFlag":0,"BrackishFlag":0,"TerrestrialFlag":0,"Authorstring":"Zermoglio, P.F.; Guralnick, R.P.; Wieczorek, J.R.","OrigTitleTranslFlag":0,"Authorstringtrunc":"Zermoglio, P.F. <i>et al.</i>","Englishabstract":"Taxonomic names associated with digitized biocollections labels have flooded into repositories such as GBIF, iDigBio and VertNet. The names on these labels are often misspelled, out of date, or present other problems, as they were often captured only once during accessioning of specimens, or have a history of label changes without clear provenance. Before records are reliably usable in research, it is critical that these issues be addressed. However, still missing is an assessment of the scope of the problem, the effort needed to solve it, and a way to improve effectiveness of tools developed to aid the process. We present a carefully human-vetted analysis of 1000 verbatim scientific names taken at random from those published via the data aggregator VertNet, providing the first rigorously reviewed, reference validation data set. In addition to characterizing formatting problems, human vetting focused on detecting misspelling, synonymy, and the incorrect use of Darwin Core. Our results reveal a sobering view of the challenge ahead, as less than 47% of name strings were found to be currently valid. 
More optimistically, nearly 97% of name combinations could be resolved to a currently valid name, suggesting that computer-aided approaches may provide feasible means to improve digitized content. Finally, we associated names back to biocollections records and fit logistic models to test potential drivers of issues. A set of candidate variables (geographic region, year collected, higher-level clade, and the institutional digitally accessible data volume) and their 2-way interactions all predict the probability of records having taxon name issues, based on model selection approaches. We strongly encourage further experiments to use this reference data set as a means to compare automated or computer-aided taxon name tools for their ability to resolve and improve the existing wealth of legacy data.","AbstractOtherLang":null,"BibLvlCode":"AS","StandardTitle":"A standardized Reference Data Set for Vertebrate Taxon Name Resolution","OrigTitleLangCode":"en","OrigTitleLangCodeExtended":"eng","OrigTitleLangID":15,"DateLastModified":{"date":"2024-12-10 01:33:01.897972","timezone_type":1,"timezone":"+01:00"},"UserAccessRight":null,"UserAccID":null,"AuthorKeywords":null,"OtherDescriptors":null,"Notes":null,"AnaPub":2016,"MonPub":null,"DateUpdate":"2018-02-13","DateCreate":"2016-02-29","SecASFANote":null,"ConfID":null,"PeerRev":1,"VlizCoreFlag":1,"WoScode":"WOS:000368033100051","VABBcode":null,"OpenAcc":1,"DOI":"10.1371/journal.pone.0146894"},"refs":null,"anarec":{"AnaID":253418,"PubliDate":2016,"Pagination":"e0146894","XtraPublOfAnaID":null,"ISBN":null,"Volume":"11","Issue":"1","BRefMon":null,"BRefMonRR":null,"BRefXtra":null,"BRefXtraRR":null,"SerBRefID":123954,"SerRR":"PLoS One. Public Library of Science: San Francisco.  
ISSN 1932-6203; e-ISSN 1932-6203","StandardTitleSer":"PLoS One","ISSN":"1932-6203","AbbrevSer":"PLoS One","StandardTitleMon":null,"StartPage":null,"Pages":null,"ToPubliDate":null,"BRefBibLvlCode":"S","SerNotes":null},"monrec":null,"serrec":null,"relations":null,"relationsRev":null,"addrec":null,"othpubs":null,"ownerships":null,"authors":[{"AutName":"Zermoglio","Firstname":"Paula","Initials":"P.F.","Affiliation":"Univ Buenos Aires, Fac Ciencias Exactas & Nat, Inst IEGEBA CONICET UBA, Dept Ecol Genet & Evoluc, Buenos Aires, DF, Argentina.","Discriminator":null,"CorporateFlag":0,"BEntID":245231,"AutID":293740,"OrderNr":1,"DegrID":null,"EditorFlag":0,"CorrespFlag":0,"IllustratorFlag":0,"ReviserFlag":0,"TranslatorFlag":0,"InsAcronym":null,"InsFSN":null,"ORCID":null,"PersID":null,"InsID":null},{"AutName":"Guralnick","Firstname":"Robert","Initials":"R.P.","Affiliation":"Univ Florida, Univ Florida Museum Nat Hist, Gainesville, FL USA.","Discriminator":null,"CorporateFlag":0,"BEntID":245231,"AutID":293741,"OrderNr":2,"DegrID":null,"EditorFlag":0,"CorrespFlag":0,"IllustratorFlag":0,"ReviserFlag":0,"TranslatorFlag":0,"InsAcronym":null,"InsFSN":null,"ORCID":null,"PersID":null,"InsID":null},{"AutName":"Wieczorek","Firstname":"John","Initials":"J.R.","Affiliation":"Univ Calif Berkeley, Museum Vertebrate Zool, Berkeley, CA 94720 USA.","Discriminator":null,"CorporateFlag":0,"BEntID":245231,"AutID":293742,"OrderNr":3,"DegrID":null,"EditorFlag":0,"CorrespFlag":0,"IllustratorFlag":0,"ReviserFlag":0,"TranslatorFlag":0,"InsAcronym":null,"InsFSN":null,"ORCID":null,"PersID":null,"InsID":null}],"mapdetails":null,"datasets":null,"monographs":null,"monparts":null,"serparts":null,"BEntOpen":null,"BEntPrivate":null,"availability":[{"BInstID":286193,"LibID":36,"BRefID":253418,"EmbargoDate":null,"FullEmbargoDate":null,"PhysMedID":16,"hasOCRd":1,"ShelfLocCode":"286193","RFID":null,"PaidValue":null,"Medium":"Server","Description":"VLIZ Open Access","Acronym":"VLIZ","Library":"Vlaams Instituut 
voor de Zee","DutchTerm":"Open access","URL":null,"ClassifID":53,"Classification":"Open access","ReqLink":null,"ClassifTypID":1,"URLLocation":"https://www.vliz.be/imisdocs/publications/","SubDir":null,"InternalReq":0,"LoggedInReq":0,"Disclaimer":null,"DutchDisclaimer":null,"FileFormat":".pdf","FileDescr":"pdf","InsPub":1,"InsID":36,"FileFormID":6,"LendableFlag":1,"PublicFlag":1,"orderLib":"A","Notes":null,"AccConID":null,"AccessConstraint":null,"LicURL":null}],"litstyles":null,"thespers":null,"arch2discl":null,"SERpubls":[{"PublName":"Public Library of Science","City":"San Francisco"}],"MONpubls":null,"pictures":[],"thestermsPath":null,"thestermsASFA":null,"taxtermsASFA":null,"geotermsASFA":null,"collections":[{"Collection":"VLIZ Acknowledged Publications","ShortName":"VLIZ ackn"}],"conf":null,"proj":null,"Physdatasets":null,"spcols":{"955":{"SpName":"Catalogue of Life acknowledged","SpColID":955,"ParSpColID":null,"TopParID":null,"ShortName":"Catalogue of Life ackn","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":null,"SpColPath":"Catalogue of Life ackn"},"941":{"SpName":"LifeWatch Species Information Backbone","SpColID":941,"ParSpColID":39,"TopParID":39,"ShortName":"LifeWatch Species Information Backbone","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":39,"SpColPath":"VLIZ ackn/LifeWatch Species Information Backbone"},"39":{"SpName":"VLIZ Acknowledged Publications","SpColID":39,"ParSpColID":null,"TopParID":null,"ShortName":"VLIZ ackn","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":null,"SpColPath":"VLIZ ackn"},"507":{"SpName":"World Register of Marine Species","SpColID":507,"ParSpColID":null,"TopParID":null,"ShortName":"WoRMS website","URLLocation":null,"LibID":null,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":null,"SpColPath":"WoRMS website"},"915":{"SpName":"World Register of Marine Species (WoRMS) 
acknowledged","SpColID":915,"ParSpColID":941,"TopParID":39,"ShortName":"WoRMS ackn","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":39,"SpColPath":"VLIZ ackn/LifeWatch Species Information Backbone/WoRMS ackn"},"947":{"SpName":"WoRMS ackn - direct reference","SpColID":947,"ParSpColID":915,"TopParID":39,"ShortName":"WoRMS ackn - direct","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":39,"SpColPath":"VLIZ ackn/LifeWatch Species Information Backbone/WoRMS ackn/WoRMS ackn - direct"}},"doi":null,"publs":null,"serparttypes":null,"monauthors":null,"MParts":null,"SParts":null,"hLibs":null,"langs":[{"BEntID":245231,"AbstractFlag":0,"LangID":15,"LangCode":"en","Lang":"English","DutchTerm":"Engels","LangCodeExtended":"eng"},{"BEntID":245231,"AbstractFlag":1,"LangID":15,"LangCode":"en","Lang":"English","DutchTerm":"Engels","LangCodeExtended":"eng"}],"urls":[{"URL":"http://dx.doi.org/10.1371/journal.pone.0146894","externalID":"10.1371/journal.pone.0146894","URLTypeCode":"DOI","URLID":43412,"URLTypID":13,"URLType":"DOI","URLPrefix":"http://dx.doi.org/"}],"thesterms":null,"taxterms":null,"geoterms":null,"othterms":null,"asfacodes":null,"asfa2codes":null,"thestermsFRIS":null,"taxtermsFRIS":null,"geotermsFRIS":null,"othtermsFRIS":null,"resmessage":"","complete":1,"sessions":{"newSesName":"Chisala, Chilekwa, C.","newSesDate":{"date":"2016-02-29 08:50:44.750000","timezone_type":3,"timezone":"Europe/Brussels"},"updSesName":"Lyssens, Liesbeth, L.","updSesDate":{"date":"2018-02-13 19:34:46.407000","timezone_type":3,"timezone":"Europe/Brussels"}}}
