{"refrec":{"BRefID":310688,"RR":"<b>Kopperud, B.T.; Lidgard, S.; Liow, L.H.</b> (2019). Text-mined fossil biodiversity dynamics using machine learning. <i>Proc. - Royal Soc., Biol. Sci. 286(1901)</i>: 20190022. <a href=\"https://dx.doi.org/10.1098/rspb.2019.0022\" target=\"_blank\">https://dx.doi.org/10.1098/rspb.2019.0022</a>","BEntID":303049,"PublicFlag":1,"CheckedFlag":0,"wosflag":1,"vabbflag":1,"RefStringPartII":". <i>Proc. - Royal Soc., Biol. Sci. 286(1901)</i>: 20190022. <a href=\"https://dx.doi.org/10.1098/rspb.2019.0022\" target=\"_blank\">https://dx.doi.org/10.1098/rspb.2019.0022</a>","DocTypID":8,"DocType":"Journal article","MarineFlag":0,"FreshFlag":0,"BrackishFlag":0,"TerrestrialFlag":0,"Authorstring":"Kopperud, B.T.; Lidgard, S.; Liow, L.H.","OrigTitleTranslFlag":0,"Authorstringtrunc":"Kopperud, B.T. <i>et al.</i>","Englishabstract":"Documented occurrences of fossil taxa are the empirical foundation for understanding large-scale biodiversity changes and evolutionary dynamics in deep time. The fossil record contains vast amounts of understudied taxa. Yet the compilation of huge volumes of data remains a labour-intensive impediment to a more complete understanding of Earth’s biodiversity history. Even so, many occurrence records of species and genera in these taxa can be uncovered in the palaeontological literature. Here, we extract observations of fossils and their inferred ages from unstructured text in books and scientific articles using machine-learning approaches. We use Bryozoa, a group of marine invertebrates with a rich fossil record, as a case study. Building on recent advances in computational linguistics, we develop a pipeline to recognize taxonomic names and geologic time intervals in published literature and use supervised learning to machine-read whether the species in question occurred in a given age interval. Intermediate machine error rates appear comparable to human error rates in a simple trial, and resulting genus richness curves capture the main features of published fossil diversity studies of bryozoans. We believe our automated pipeline, that greatly reduced the time required to compile our dataset, can help others compile similar data for other taxa.","AbstractOtherLang":null,"BibLvlCode":"AS","StandardTitle":"Text-mined fossil biodiversity dynamics using machine learning","OrigTitleLangCode":"en","OrigTitleLangCodeExtended":"eng","OrigTitleLangID":15,"DateLastModified":{"date":"2026-06-15 01:33:17.568350","timezone_type":1,"timezone":"+02:00"},"UserAccessRight":null,"UserAccID":null,"AuthorKeywords":"cheilostome bryozoans; fossil occurrences; palaeobiodiversity; natural language processing; information extraction; literature compilation","OtherDescriptors":null,"Notes":null,"AnaPub":2019,"MonPub":null,"DateUpdate":"2019-04-30","DateCreate":"2019-04-30","SecASFANote":null,"ConfID":null,"PeerRev":1,"VlizCoreFlag":1,"WoScode":"WOS:000465657800007","VABBcode":null,"OpenAcc":1,"DOI":"10.1098/rspb.2019.0022"},"refs":null,"anarec":{"AnaID":310688,"PubliDate":2019,"Pagination":"20190022","XtraPublOfAnaID":null,"ISBN":null,"Volume":"286","Issue":"1901","BRefMon":null,"BRefMonRR":null,"BRefXtra":null,"BRefXtraRR":null,"SerBRefID":204863,"SerRR":"Proceedings of the Royal Society of London. Series B. The Royal Society: London.  ISSN 0962-8452; e-ISSN 1471-2954","StandardTitleSer":"Proceedings of the Royal Society of London. Series B","ISSN":"0962-8452","AbbrevSer":"Proc. - Royal Soc., Biol. Sci.","StandardTitleMon":null,"StartPage":null,"Pages":null,"ToPubliDate":null,"BRefBibLvlCode":"S","SerNotes":null},"monrec":null,"serrec":null,"relations":null,"relationsRev":null,"addrec":null,"othpubs":null,"ownerships":null,"authors":[{"AutName":"Kopperud","Firstname":"Bjørn Tore","Initials":"B.T.","Affiliation":null,"Discriminator":null,"CorporateFlag":0,"BEntID":303049,"AutID":374922,"OrderNr":1,"DegrID":null,"EditorFlag":0,"CorrespFlag":0,"IllustratorFlag":0,"ReviserFlag":0,"TranslatorFlag":0,"InsAcronym":null,"InsFSN":null,"ORCID":null,"PersID":null,"InsID":null},{"AutName":"Lidgard","Firstname":"Scott","Initials":"S.","Affiliation":"Field Museum Nat Hist, Dept Geol, Chicago, IL 60605 USA.","Discriminator":null,"CorporateFlag":0,"BEntID":303049,"AutID":310931,"OrderNr":2,"DegrID":null,"EditorFlag":0,"CorrespFlag":0,"IllustratorFlag":0,"ReviserFlag":0,"TranslatorFlag":0,"InsAcronym":null,"InsFSN":null,"ORCID":null,"PersID":null,"InsID":null},{"AutName":"Liow","Firstname":"Lee Hsiang","Initials":"L.H.","Affiliation":"Univ Oslo, Dept Biosci, Ctr Ecol & Evolutionary Synth, N-0316 Oslo, Norway.","Discriminator":null,"CorporateFlag":0,"BEntID":303049,"AutID":295349,"OrderNr":3,"DegrID":null,"EditorFlag":0,"CorrespFlag":0,"IllustratorFlag":0,"ReviserFlag":0,"TranslatorFlag":0,"InsAcronym":null,"InsFSN":null,"ORCID":null,"PersID":null,"InsID":null}],"mapdetails":null,"datasets":null,"monographs":null,"monparts":null,"serparts":null,"BEntOpen":null,"BEntPrivate":null,"availability":[{"BInstID":328511,"LibID":36,"BRefID":310688,"EmbargoDate":null,"FullEmbargoDate":null,"PhysMedID":16,"hasOCRd":1,"ShelfLocCode":"328511","RFID":null,"PaidValue":null,"Medium":"Server","Description":"VLIZ Open Access","Acronym":"VLIZ","Library":"Vlaams Instituut voor de Zee","DutchTerm":"Open access","URL":null,"ClassifID":53,"Classification":"Open access","ReqLink":null,"ClassifTypID":1,"URLLocation":"https://www.vliz.be/imisdocs/publications/","SubDir":null,"InternalReq":0,"LoggedInReq":0,"Disclaimer":null,"DutchDisclaimer":null,"FileFormat":".pdf","FileDescr":"pdf","InsPub":1,"InsID":36,"FileFormID":6,"LendableFlag":1,"PublicFlag":1,"orderLib":"A","Notes":null,"AccConID":null,"AccessConstraint":null,"LicURL":null}],"litstyles":null,"thespers":null,"arch2discl":null,"SERpubls":[{"PublName":"The Royal Society","City":"London"}],"MONpubls":null,"pictures":[],"thestermsPath":null,"thestermsASFA":null,"taxtermsASFA":[{"TaxTerm":"Bryozoa"}],"geotermsASFA":null,"collections":[{"Collection":"VLIZ Acknowledged Publications","ShortName":"VLIZ ackn"}],"conf":null,"proj":null,"Physdatasets":null,"spcols":{"941":{"SpName":"LifeWatch Species Information Backbone","SpColID":941,"ParSpColID":39,"TopParID":39,"ShortName":"LifeWatch Species Information Backbone","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":39,"SpColPath":"VLIZ ackn/LifeWatch Species Information Backbone"},"39":{"SpName":"VLIZ Acknowledged Publications","SpColID":39,"ParSpColID":null,"TopParID":null,"ShortName":"VLIZ ackn","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":null,"SpColPath":"VLIZ ackn"},"507":{"SpName":"World Register of Marine Species","SpColID":507,"ParSpColID":null,"TopParID":null,"ShortName":"WoRMS website","URLLocation":null,"LibID":null,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":null,"SpColPath":"WoRMS website"},"915":{"SpName":"World Register of Marine Species (WoRMS) acknowledged","SpColID":915,"ParSpColID":941,"TopParID":39,"ShortName":"WoRMS ackn","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":39,"SpColPath":"VLIZ ackn/LifeWatch Species Information Backbone/WoRMS ackn"},"947":{"SpName":"WoRMS ackn - direct reference","SpColID":947,"ParSpColID":915,"TopParID":39,"ShortName":"WoRMS ackn - direct","URLLocation":null,"LibID":36,"OpenRepoFlag":null,"SpTypID":null,"TopParIDNotWebsite":39,"SpColPath":"VLIZ ackn/LifeWatch Species Information Backbone/WoRMS ackn/WoRMS ackn - direct"}},"doi":null,"publs":null,"serparttypes":null,"monauthors":null,"MParts":null,"SParts":null,"hLibs":null,"langs":[{"BEntID":303049,"AbstractFlag":0,"LangID":15,"LangCode":"en","Lang":"English","DutchTerm":"Engels","LangCodeExtended":"eng"},{"BEntID":303049,"AbstractFlag":1,"LangID":15,"LangCode":"en","Lang":"English","DutchTerm":"Engels","LangCodeExtended":"eng"}],"urls":[{"URL":"https://dx.doi.org/10.1098/rspb.2019.0022","externalID":"10.1098/rspb.2019.0022","URLTypeCode":"DOI","URLID":75851,"URLTypID":13,"URLType":"DOI","URLPrefix":"http://dx.doi.org/"}],"thesterms":null,"taxterms":[{"TaxTerm":"Bryozoa","AphiaID":146142,"TaxtID":19527}],"geoterms":null,"othterms":null,"asfacodes":null,"asfa2codes":null,"thestermsFRIS":null,"taxtermsFRIS":[{"TaxTerm":"Bryozoa","DutchTerm":"Mosdiertjes","AphiaID":146142,"TaxtID":19527}],"geotermsFRIS":null,"othtermsFRIS":null,"resmessage":"","complete":1,"sessions":{"newSesName":"Chisala, Chilekwa, C.","newSesDate":{"date":"2019-04-30 09:15:27.477000","timezone_type":3,"timezone":"Europe/Brussels"},"updSesName":"Chisala, Chilekwa, C.","updSesDate":{"date":"2019-04-30 09:15:27.477000","timezone_type":3,"timezone":"Europe/Brussels"}}}
