Integrated Publishing Toolkit (IPT)

free and open access to biodiversity data
  • The version you are looking at is not the latest version! Please refer to the versions table to find the latest.

Centre for Biodiversity Genomics - DNA for Canadian Specimens

Version 1.8 published by University of Guelph on May 3, 2018

The Centre for Biodiversity Genomics (CBG) at the University of Guelph is spearheading a novel approach to biodiversity research within Canada and internationally. Its three research units (CBG Collections; CBG Genomics, also known as the Canadian Centre for DNA Barcoding, CCDB; and CBG Informatics) are advancing 21st-century biodiversity science by enabling species identification and discovery based on the analysis of sequence diversity in short, standardized gene regions known as DNA barcodes. CBG Collections maintains a globally unique natural history collection of 3.3 million specimens. Every specimen is digitized, and its exact storage location is tracked in a collection management information system for quick reference and retrieval. The databased information for every voucher is also archived in the Barcode of Life Data System (BOLD; www.boldsystems.org), permitting the permanent storage, validation and analysis of barcode sequence data and associated specimen metadata. Most (88.6%) of the specimens have been DNA barcoded, and a few representatives of every species have been digitally imaged. The CCDB holds high-quality DNA extracts in a secure 2000 ft² ultra-cold freezer bank. The freezer bank holds the residual material left after barcode analysis of samples: 5.3 million extracts from over 250,000 species derived from 231 countries, oceans and dependent territories, each connected to a voucher specimen on BOLD. This resource represents the extractions held in the DNA Archive of the CCDB that are derived from the Canadian specimens held in CBG Collections. Please direct inquiries to info@ccdb.ca.

Data Records

The data in this occurrence resource have been published as a Darwin Core Archive (DwC-A), a standardized format for sharing biodiversity data as a set of one or more data tables. The core data table contains 1,500,515 records. Six extension data tables also exist. An extension record supplies extra information about a core record. The number of records in each data table is listed below.

  • Occurrence (core)
    1500515
  • Amplification 
    1500515
  • Loan 
    1500515
  • MaterialSample 
    1500515
  • Permit 
    1500515
  • Preparation 
    1500515
  • ResourceRelationship 
    1500515
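
As a rough illustration of the core-plus-extensions layout described above, the following sketch lists the tables that a Darwin Core Archive declares in its meta.xml descriptor. It relies only on the Python standard library. The file names and row types in the demo archive are placeholders invented for the example; they are not taken from this resource, whose actual file names should be read from its own meta.xml.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by the Darwin Core text-archive descriptor (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def list_dwca_tables(archive):
    """Return (role, row_type, file_location) for each table declared in meta.xml.

    `archive` may be a path or a file-like object holding the zipped DwC-A.
    """
    with zipfile.ZipFile(archive) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
    tables = []
    for role in ("core", "extension"):
        for node in root.findall(f"{DWC_TEXT_NS}{role}"):
            location = node.findtext(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location")
            tables.append((role, node.get("rowType"), location))
    return tables


def build_demo_archive():
    """Build a tiny in-memory archive with placeholder names (illustration only)."""
    meta = (
        '<archive xmlns="http://rs.tdwg.org/dwc/text/">'
        '<core rowType="http://rs.tdwg.org/dwc/terms/Occurrence">'
        "<files><location>occurrence.txt</location></files></core>"
        '<extension rowType="http://example.org/terms/Amplification">'
        "<files><location>amplification.txt</location></files></extension>"
        "</archive>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("meta.xml", meta)
        zf.writestr("occurrence.txt", "id\n1\n")
        zf.writestr("amplification.txt", "id\n1\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for role, row_type, location in list_dwca_tables(build_demo_archive()):
        print(role, row_type, location)
```

To inspect a downloaded copy instead of the demo archive, pass the path of the zip file to list_dwca_tables.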

This IPT archives the data and thus serves as the data repository. The data and resource metadata are available for download in the downloads section. The versions table lists other versions of the resource that have been made publicly available and allows tracking changes made to the resource over time.

Downloads

Download the latest version of this resource data as a Darwin Core Archive (DwC-A) or the resource metadata as EML or RTF:

Data as a DwC-A file download 1,500,515 records in English (127 MB)  - Update frequency: as needed
Metadata as an EML file download in English (21 KB)
Metadata as an RTF file download in English (13 KB)

Versions

The table below shows only published versions of the resource that are publicly accessible.

How to cite

Please be aware, this is an old version of the dataset.  Researchers should cite this work as follows:

Telfer A, Bessonov K, Zakharov EV and deWaard JR (2018): Centre for Biodiversity Genomics - DNA for Canadian Specimens. v1.8. University of Guelph. Dataset/Occurrence. https://ipt.uoguelph.ca/ipt/resource?r=public_data&v=1.8

Rights

Researchers should respect the following rights statement:

The publisher and rights holder of this work is University of Guelph. This work is licensed under a Creative Commons Attribution Non Commercial (CC-BY-NC) 4.0 License.

GBIF Registration

This resource has not been registered with GBIF

Keywords

Occurrence; Specimen

Contacts

Who created the resource:

Angela Telfer
Data Management Lead - Collections
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA 15198244120
http://biodiversitygenomics.net/

Who can answer questions about the resource:

Jeremy deWaard
Associate Director - Collections
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA 15198244120
http://biodiversitygenomics.net/
Evgeny Zakharov
Associate Director - Genomics
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA 15198244120
http://biodiversitygenomics.net/

Who filled in the metadata:

Angela Telfer
Data Management Lead - Collections
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA 15198244120
http://biodiversitygenomics.net/

Who else was associated with the resource:

Owner
Jeremy deWaard
Associate Director - Collections
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA 1519824412058125
Custodian Steward
Angela Telfer
Data Management Lead - Collections
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA
Programmer
Kyrylo Bessonov
Postdoctoral Fellow – Genomics
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA
Owner
Evgeny Zakharov
Associate Director – Genomics
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA
Content Provider
Suresh Naik
Research Scientist / Curator of DNA Archive – Genomics
Centre for Biodiversity Genomics, University of Guelph 50 Stone Road East N1G 2W1 Guelph Ontario CA

Geographic Coverage

All specimens were collected in Canada.

Bounding Coordinates South West [42.812, -149.062], North East [83.195, -48.164]
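
A simple sanity check against the bounding coordinates above can be sketched as follows. It assumes the coordinate pairs are given as [latitude, longitude] in decimal degrees; the function name and the sample points are hypothetical. Because the box does not cross the antimeridian, plain range comparisons are sufficient.

```python
# Bounding coordinates as stated above, read as (latitude, longitude).
SOUTH_WEST = (42.812, -149.062)
NORTH_EAST = (83.195, -48.164)


def in_bounding_box(lat, lon, south_west=SOUTH_WEST, north_east=NORTH_EAST):
    """Return True when the point lies inside the box (edges included)."""
    lat_ok = south_west[0] <= lat <= north_east[0]
    lon_ok = south_west[1] <= lon <= north_east[1]
    return lat_ok and lon_ok


if __name__ == "__main__":
    print(in_bounding_box(43.53, -80.23))  # a point in southern Ontario
    print(in_bounding_box(40.00, -80.00))  # south of the stated box
```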

Taxonomic Coverage

Specimens were identified to species where possible by taxonomic experts. Because we operate a high-throughput facility with an influx of approximately 1 million specimens every year, we also rely on semi-automated and automated methods of assigning taxonomy to specimens. Using DNA barcode sequences from BOLD Systems (BOLD; www.boldsystems.org), the remaining specimens were assigned a taxonomic name based on sequence similarity to named specimens. Two different procedures were used to assign names based on DNA barcode sequences:

  1. CollectionsID. Taxonomic names were assigned to all specimens found in the same Barcode Index Number (BIN). An algorithm clusters barcode sequences into BINs based on sequence similarity, and BINs show high concordance with species (see Ratnasingham and Hebert 2013, https://doi.org/10.1371/journal.pone.0066213). CollectionsID scans the existing database of named specimens and assigns taxonomic information whenever a BIN match is found. Applies to: phylum- to species-level identifications.
  2. BOLD ID Engine (manual). BOLD stores sequence information for every barcoded organism. The BOLD ID Engine compares the sequence of a chosen specimen against the other sequences in the database and returns the taxonomic information for the best matches. We then assign taxonomy based on the following rule set:
     a. Order to family identification: assigned when the class, order, or family name has greater than 80% sequence similarity to the listed results.
     b. Genus identification: assigned when the genus name has greater than 95% sequence similarity to the listed results.
     c. Species identification: assigned when the species name has greater than 98% sequence similarity to the listed results (indicating the specimens are from the same BIN).

If there are conflicts in the results (i.e. more than one family name is present with greater than 80% sequence similarity), we assign a name when it is overwhelmingly clear that one name is more prevalent and is likely the correct taxonomic name (e.g. with 20 Tipulidae and 1 Limoniidae listed, we would assign Tipulidae). This is a judgment call made by trained senior staff of the CBG Collections Unit.
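
The similarity thresholds in the rule set above can be sketched as a small function. This is only an illustration of the stated cut-offs, not the CBG's actual tooling. The function names are hypothetical, and the 0.9 prevalence share used to resolve conflicts is an assumption standing in for what the text describes as a judgment call by senior staff.

```python
from collections import Counter


def deepest_rank(similarity_percent):
    """Return the deepest rank the stated thresholds allow, or None.

    Thresholds follow the rule set: >98% species, >95% genus,
    >80% class/order/family (reported here as "family").
    """
    if similarity_percent > 98.0:
        return "species"
    if similarity_percent > 95.0:
        return "genus"
    if similarity_percent > 80.0:
        return "family"
    return None


def dominant_name(names, min_share=0.9):
    """Return the most common name if it clearly dominates, else None.

    min_share is an assumed cut-off; the source leaves this decision
    to trained staff rather than a fixed number.
    """
    if not names:
        return None
    name, count = Counter(names).most_common(1)[0]
    return name if count / len(names) >= min_share else None


if __name__ == "__main__":
    matches = ["Tipulidae"] * 20 + ["Limoniidae"]
    print(deepest_rank(85.0), dominant_name(matches))
```

With the example from the text (20 Tipulidae and 1 Limoniidae), the dominant share is 20/21, so the sketch returns Tipulidae; a result split evenly between two families returns no name and would be left for manual review.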

Phylum: Arthropoda, Chordata, Annelida, Mollusca, Cnidaria, Platyhelminthes, Brachiopoda, Bryozoa, Echinodermata, Hemichordata, Nematoda, Nemertea, Porifera, Priapulida, Rhodophyta, Tardigrada

Temporal Coverage

Start Date / End Date 1981-01-01 / 2017-04-30

Project Data

The specimens/DNA in this resource are a composite of many collecting events undertaken by the Centre for Biodiversity Genomics staff and associated parties. All are currently held in the Specimen and DNA Archives of the Centre for Biodiversity Genomics. We wish to thank the numerous BOLD project managers whose records are held in our Archive and who granted permission to publish their data as part of this dataset: Adriana E. Radulovici, Alex Smith, Alex V. Borisenko, Anthony W. Thomas, Beth Clare, Beverly Mcclenaghan, Dirk Steinke, Donald J. Baird, E. Anne Chambers, Harpreet Singh Ghator, Jeffrey M. Webb, Kara K. S. Layton, Nicholas Jeffery, Sarah Adamowicz and Monica R. Young.

Title Centre for Biodiversity Genomics - Canadian Specimens
Funding Natural Sciences and Engineering Research Council of Canada (NSERC), Genome Canada, Ontario Genomics Institute, Ontario Ministry of Research and Innovation, the Canada Foundation for Innovation, and the McCain/Evans Foundation.
Design Description Collections and acquisitions are aimed at gathering a synoptic collection of voucher specimens of non-endangered invertebrates that occur in protected areas across North America, for subsequent molecular analyses at the Centre for Biodiversity Genomics and identification by taxonomic experts.

The personnel involved in the project:

Owner
Jeremy deWaard
Owner
Evgeny Zakharov
Programmer
Kyrylo Bessonov
Content Provider
Suresh Naik
Custodian Steward
Angela Telfer

Sampling Methods

Please see individual data records for complete collection details.

Study Extent Sampling frequency varies according to project, year and location of collection. Please see individual data records for complete collection details.
Quality Control All specimens are visible on BOLD (www.boldsystems.org). Through comparison with other specimens using their DNA barcode sequences, contaminated specimens and misidentifications were discovered and fixed where possible. All fields underwent a data cleansing process to ensure data were entered in a standardized manner.

Method step description:

  1. Please see individual data records for complete collection details.

Collection Data

Collection Name Centre for Biodiversity Genomics
Collection Identifier BIOUG
Specimen preservation methods Alcohol, Deep frozen, Dried, Mounted, Pinned

Additional Metadata

Purpose The Centre for Biodiversity Genomics (CBG) at the University of Guelph maintains a globally unique natural history collection and is spearheading a novel approach to biodiversity research within Canada and internationally. To maintain continued accessibility of this digitized collection, the CBG aims to liberate more data and derivatives of the specimens it holds.
Alternative Identifiers https://ipt.uoguelph.ca/ipt/resource?r=public_data