Document of bibliographic reference 107223
BibliographicReference record
- Type
- Bibliographic resource
- Type of document
- Book chapters
- Type of document
- Conference paper
- BibLvlCode
- AMS
- Title
- Identifying erroneous data using outlier detection techniques
- Abstract
- Common data quality problems observed in OBIS are described. DBSCAN, a density-based clustering algorithm for large spatial databases, is employed to identify geographical outliers in federated data from a public Web service on the OBIS Portal. The algorithm is shown to be effective and efficient for this purpose. The relationship between outliers and erroneous data points is discussed, along with the future plan to develop an operational data quality checking tool based on this algorithm.
- Bibliographic citation
- Zhuang, W.; Zhang, Y.; Grassle, J.F. (2007). Identifying erroneous data using outlier detection techniques, in: Vanden Berghe, E. et al. (Ed.) Proceedings Ocean Biodiversity Informatics: International Conference on Marine Biodiversity Data Management, Hamburg, Germany 29 November to 1 December, 2004. VLIZ Special Publication, 37: pp. 187-192
- Topic
- Marine
- Access rights
- open access
- Is accessible for free
- true
thesaurus terms
- term
- Clustering (term code: 62987 - defined in term set: CSA Technology Research Database Master Thesaurus)
- Clustering (term code: 111141 - defined in term set: CAB Thesaurus)
- Data (term code: 2086 - defined in term set: ASFA Thesaurus List)
- Quality assurance (term code: 6649 - defined in term set: ASFA Thesaurus List)
- Quality control (term code: 6650 - defined in term set: ASFA Thesaurus List)
Other terms
- other terms associated with this publication
- Data quality solving
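Illustrative sketch
- The abstract describes using a density-based clustering algorithm (DBSCAN) to flag geographical outliers. The block below is a minimal sketch of that idea, not the authors' implementation: this record does not give the paper's parameters, distance metric, or data format, so the haversine distance, the `eps_km` and `min_pts` values, and the sample `records` are all assumptions chosen for illustration. Points that DBSCAN labels as noise are treated as candidate outliers; as the abstract notes, an outlier is not necessarily an erroneous record.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_pts):
    """Plain O(n^2) DBSCAN; returns one cluster label per point (NOISE = -1)."""
    labels = [None] * len(points)

    def neighbors(i):
        # Neighborhood includes the point itself, as in the usual definition.
        return [j for j in range(len(points))
                if haversine_km(points[i], points[j]) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical occurrence records (lat, lon); the last one has a flipped
# longitude sign, a typical kind of coordinate error.
records = [
    (40.00, -70.00), (40.01, -70.02), (39.99, -69.98),
    (40.02, -70.01), (39.98, -70.03), (40.00, 70.00),
]
labels = dbscan(records, eps_km=10.0, min_pts=3)  # assumed parameters
outliers = [p for p, lab in zip(records, labels) if lab == NOISE]
print(outliers)
```

- In this sketch the five nearby records form one cluster and the record with the flipped longitude is left as noise. Flagged records would still need review before being treated as errors.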