Caveat consumptor notitia museo: Let the museum data user beware

Lot accession information from natural history collections represents a potentially vital source of large datasets to test biodiversity, biogeography and macroecology hypotheses. But does such information provide an accurate portrayal of the natural world? We review the many types of bias and error intrinsic to museum collection data and consider how these factors may affect their ability to accurately test ecological hypotheses.


We considered all Texas land snail collections from the two major repositories in the state and compared them with an ecological sample drawn across the same landscape. We found that museum collection localities were biased in favour of regions with higher human population densities and iconic destinations. Collections also tended to be made during attractive temporal windows. Small, uncharismatic taxa tended to be under-collected, while larger, charismatic species were over-collected. As a result, for most species it was impossible to use museum lot frequency to accurately predict frequency and abundance in an ecological sample. The species misidentification rate was approximately 20%, while 4% of lots represented more than one species. Errors were spread across the entire shell size spectrum and were present in 75% of taxonomic families. Contingency table analysis documented significant dependence of both misidentification and mixed lot rates upon shell size and family richness.
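
For readers unfamiliar with contingency table analysis, the following is a minimal sketch of a Pearson chi-square test of independence, the general kind of test that could relate misidentification to shell size class. It is an illustration only: the counts are invented and are not data from the study, the size classes and function names are hypothetical, and the abstract does not state which specific test or categories the authors used.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

def chi_square_sf_even_df(stat, df):
    """Upper-tail p-value; closed form valid only for even degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# Hypothetical counts (NOT from the study).
# Rows: misidentified lots, correctly identified lots.
# Columns: small, medium, large shell size classes.
hypothetical_counts = [
    [30, 20, 10],
    [70, 80, 90],
]

stat, df = chi_square_independence(hypothetical_counts)
p_value = chi_square_sf_even_df(stat, df)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

With these invented counts the statistic is 12.5 on 2 degrees of freedom (p of about 0.0019), so independence of misidentification and size class would be rejected at conventional thresholds. A general-purpose statistics library would be the usual choice for tables with odd degrees of freedom or small expected counts.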


Researchers should limit their use of museum record data to situations where their inherent biases and errors are irrelevant, rectifiable or explicitly considered. At the same time, museums should begin incorporating expert specimen verification into their digitization programs.


© 2019 John Wiley & Sons Ltd

Publication Title

Global Ecol Biogeogr