Freitas, TMS, LFA Montag, P De Marco Jr & J Hortal, 2020. How reliable are species identifications in biodiversity big data? Evaluating the records of a neotropical fish family in online repositories, Systematics and Biodiversity, doi: 10.1080/14772000.2020.1730473
Abstract
The increase of free and open online biodiversity databases is of paramount importance for current research in ecology and evolution. However, little attention is paid to using updated taxonomy in these “biodiversity big data” repositories and the quality of their taxonomic information is often questioned. Here we assess how reliable is the current use of nomenclatural classification in the distributional information available from two biodiversity information networks: GBIF and the Brazilian SpeciesLink. We use as a study case the records of Auchenipteridae, a Neotropical fish family that has been subject to recent taxonomical reviews. A data filtering procedure was applied to identify and quantify the inaccuracies in the taxonomical status of the records in three steps: assessment of identification accuracy at the family, genus or species level; current validity of species name; and assignation of inaccurate species records to different categories of classification quality. Synonyms, nonexistent combinations, and outdated combinations were reassigned to currently valid species. A total of 9148 records of Auchenipteridae fishes were analyzed, of which 4165 were from GBIF and 4983 from SpeciesLink, deriving from 46 and 31 sources, respectively. After correcting all possible records following the taxonomic data filtering steps, 6988 records (76.4% of the original) were adequate for describing species distributions, while 2160 remained inaccurate. The most inaccurate records at the species level were due to the use of outdated nomenclatures, resulting in non-valid combinations of species and genus, and synonymy. Our results evidence a large taxonomic inconsistency among records, and, most importantly, that taxonomic information obtained from repositories should be used with caution. 
Many inaccuracy issues may be embedded in the biodiversity databases’ records, which could lead researchers to provide an incomplete or even mistaken perspective of the variations in the natural world.
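For anyone who wants to try something similar on their own downloads, the three filtering steps described in the abstract (identification level, current validity of the name, then categorising and reassigning the problem records) could be sketched roughly as below. This is only a minimal illustration, not the authors' actual procedure: the lookup tables, the category labels, the `classify` function, and every taxon name in it are made-up placeholders. A real check would need a current catalogue of valid names.

```python
from collections import Counter

# Hypothetical lookup tables. Every name here is a placeholder, not a real taxon.
VALID_SPECIES = {"Genusa alpha", "Genusb beta"}
REASSIGN = {
    # raw name -> (currently valid name, reason the raw name is not valid)
    "Oldgenus alpha": ("Genusa alpha", "outdated combination"),
    "Genusa alfa": ("Genusa alpha", "synonym"),
}
OPEN_NOMENCLATURE = {"sp", "sp.", "cf.", "aff."}


def classify(raw_name):
    """Return (status, corrected_name) for one record's name string."""
    parts = raw_name.strip().split()

    # Step 1: is the record identified down to species level at all?
    if len(parts) < 2 or parts[1] in OPEN_NOMENCLATURE:
        return ("not identified to species", None)
    binomial = " ".join(parts[:2])

    # Step 2: is the name currently valid?
    if binomial in VALID_SPECIES:
        return ("valid", binomial)

    # Step 3: categorise the problem and reassign where possible.
    if binomial in REASSIGN:
        valid_name, category = REASSIGN[binomial]
        return (category, valid_name)
    return ("unresolved", None)


records = [
    "Genusa alpha",     # valid as given
    "Oldgenus alpha",   # outdated genus + species combination
    "Genusa alfa",      # synonym
    "Genusb sp.",       # genus-level identification only
    "Auchenipteridae",  # family-level identification only
    "Genusc gamma",     # not found in either table
]

results = [classify(name) for name in records]
usable = [corrected for _, corrected in results if corrected is not None]
summary = Counter(status for status, _ in results)

print(summary)
print(f"{len(usable)} of {len(records)} records usable after correction")
```

Note that a filter like this only checks names against a list; it cannot tell whether the fish in the jar was identified correctly in the first place. As a sanity check on the figures quoted in the abstract: 4165 + 4983 = 9148 records, 6988 / 9148 is about 76.4%, and 9148 - 6988 = 2160.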
Be careful using data from online repositories
- Silurus
- bekateen
Re: Be careful using data from online repositories
Here's the link: https://www.tandfonline.com/doi/abs/10. ... 20.1730473
Good advice, always.
Since they were running "big data"-level analytics on "big data," the one thing I imagine they couldn't do systematically was validate whether the individual identifications were correct in the first place, irrespective of reclassifications, nomenclature changes, and the like... Things such as the not-uncommon habit of assigning undescribed species in new locations to otherwise familiar and similar-looking fish from distant locations, in spite of being in vastly different drainages, etc.

Freitas et al. (2020) wrote: A data filtering procedure was applied to identify and quantify the inaccuracies in the taxonomical status of the records in three steps: assessment of identification accuracy at the family, genus or species level; current validity of species name; and assignation of inaccurate species records to different categories of classification quality. Synonyms, nonexistent combinations, and outdated combinations were reassigned to currently valid species. A total of 9148 records of Auchenipteridae fishes were analyzed, of which 4165 were from GBIF and 4983 from SpeciesLink, deriving from 46 and 31 sources, respectively. After correcting all possible records following the taxonomic data filtering steps, 6988 records (76.4% of the original) were adequate for describing species distributions, while 2160 remained inaccurate. The most inaccurate records at the species level were due to the use of outdated nomenclatures, resulting in non-valid combinations of species and genus, and synonymy.
Cheers, Eric