|
Version: 2.0
Template Description: Template for EST Data
Mappings for this template
Sections available in this template
| Section Name | Description | Conditions |
| Source | Information on the source of the dataset, the species it concerns and the name and version of the dataset | Mandatory
|
| Experiment | General experiment data | Mandatory
|
| Quality Assessment | Information about the quality measures used | Mandatory
|
| Conditions | Experimental conditions | Mandatory
|
| Summary | Description of the library containing the ESTs identified in the experiment. | Mandatory
|
| ESTs | Information on individual ESTs. | Optional Multiple sheets allowed
|
| Sequence Data | Information regarding the nucleotide sequences of the ESTs. | Optional Multiple sheets allowed
|
| Samples | Information about samples used in the experiment | Mandatory Multiple sheets allowed
|
| Institutions | List of institute codes used in passport data sections and their corresponding decoded names and addresses. | Optional
|
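As a rough illustration of how the Conditions column above might be checked before submission, the sketch below lists the mandatory sections that have no matching sheet. The helper name `missing_mandatory_sections` is hypothetical, and it assumes each sheet is named exactly after its section; the template does not say how multiple sheets of one section are named.

```python
# Section names and conditions copied from the table above.
SECTIONS = {
    "Source": "Mandatory",
    "Experiment": "Mandatory",
    "Quality Assessment": "Mandatory",
    "Conditions": "Mandatory",
    "Summary": "Mandatory",
    "ESTs": "Optional",
    "Sequence Data": "Optional",
    "Samples": "Mandatory",
    "Institutions": "Optional",
}


def missing_mandatory_sections(sheet_names):
    """Return the mandatory sections that have no matching sheet.

    Hypothetical helper; assumes sheet names equal section names.
    """
    present = set(sheet_names)
    return [
        name
        for name, condition in SECTIONS.items()
        if condition == "Mandatory" and name not in present
    ]


# A workbook holding only two sheets is missing four mandatory sections.
print(missing_mandatory_sections(["Source", "Experiment"]))
```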
Source
Section Description: Information on the source of the dataset, the species it concerns and the name and version of the dataset
see section: source in GCPDataSubmissionTemplate2.0
for the following fields
institute, principalInvestigator, projectCode, projectName, emailContact, species, ploidy, datasetName, version, creationDate, remark
Experiment
Section Description: General experiment data
see section: experiment in GCPGenotypingTemplate2.0
for the following fields
operationalTaxonomicUnit, purposeOfStudy, missingData, remark
Quality Assessment
Section Description: Information about the quality measures used
see section: qualityAssessment in GCPDataSubmissionTemplate2.0
for the following fields
qualityMeasure, standard, control, errorEstimator
Conditions
Section Description: Experimental conditions
see section: conditions in GCPGenotypingTemplate2.0
for the following fields
samplingStrategy, controlGenotypes, rnaExtraction, cDnaSynthesis, libraryConstruction, sequencingMethod, reference
| Field Name | Description | Conditions |
| RNA Extraction | A description of the RNA extraction method or reference to published method. | Mandatory
|
| cDNA Synthesis | A description of the protocol for cDNA synthesis or reference to published method. | Mandatory
|
| Library Construction | A description of the method used for cDNA library construction or reference to published method. | Mandatory
|
| Sequencing Method | A description of the method used for sequencing of the cDNA library or reference to published method. | Mandatory
|
Summary
Section Description: Description of the library containing the ESTs identified in the experiment.
| Field Name | Description | Conditions |
| Library Name | The user-defined name of the library. | Mandatory
|
| Data Source | Name of database where the ESTs were submitted.
Example: NCBI GenBank | Mandatory
|
| GeneBank Accession Numbers | Accession numbers of the EST sequences in the database.
Example: AB123456;AF000987 | Mandatory
|
| Number of ESTs | Total number of ESTs in the library. | Mandatory
|
| Creation Date in Database | Date library was created in the database. | Mandatory
|
| Species | Taxonomic or common name of species analyzed. If data is from several species, separate each species name with a semi-colon. | Mandatory
|
| Tissue Type | Tissue type and organ source.
Example: root | Mandatory
|
| Cell Type | Cell type and name of cell line, if applicable. | Optional
|
| Developmental Stage | Developmental stage of organism during sampling.
Example: juvenile | Mandatory
|
| Laboratory Host | Laboratory host of library. | Optional
|
| Vector | Name and type of vector used to construct library. | Mandatory
|
| Restriction Enzyme 1 | Restriction enzyme at site 1 of vector. | Optional
|
| Restriction Enzyme 2 | Restriction enzyme at site 2 of vector. | Optional
|
| Protocol Description | Description of library preparation methods. | Mandatory
|
| Reference | One or more references to articles in which the genotyping procedures are published. Please place each reference on a separate row in the same column. | Optional
|
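Several Summary fields, such as GeneBank Accession Numbers and Species, hold multiple entries separated by a semi-colon. A minimal sketch of reading such a cell follows; the helper name `split_entries` is hypothetical, and trimming whitespace and dropping empty entries are assumptions rather than rules stated by the template.

```python
def split_entries(value):
    """Split a semi-colon separated cell into trimmed, non-empty entries."""
    return [part.strip() for part in value.split(";") if part.strip()]


# Example value taken from the GeneBank Accession Numbers field above.
accessions = split_entries("AB123456;AF000987")
print(accessions)
```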
ESTs
Section Description: Information on individual ESTs.
see section: markers in GCPMappingTemplate2.0
for the following fields
estID, sampleID, geneBankAccessionNumber, contig, cluster, length, sequenceIDs, numberOfSsr, ssrDetected, forwardPrimer, reversePrimer, annealingTm, productLength, references
| Field Name | Description | Conditions |
| EST ID | A unique identifier of the EST. The EST ID will be unique for a specific laboratory but is not a universal identifier. It must relate to an EST ID in the list of individual ESTs. | Mandatory Unique
|
| Sample ID | A unique identifier of a DNA sample, which can be a sample in a well on a gel or a LIMS entry, or even a unique ID created specifically for this dataset. The SampleID is specific to a lab and is not a universal identifier. If the accession data is provided, it must relate to the SampleID in the accession sheet or file. | Mandatory Unique
|
| GeneBank Accession Number | Accession number of the EST sequence in the database.
Example: AB123456 | Mandatory
|
| Contig | Identifier of the contig.
Example: Contig1 | Optional
|
| Length | Number of base pairs in the EST. | Optional
|
| Cluster | Identifier for the cluster. | Optional
|
| Sequence IDs | List of sequences in the cluster. | Optional
|
| Number of SSR | Number of SSRs in the EST. | Optional
|
| SSR Detected | List of tandem repeats in the cluster. | Mandatory
|
| Product Length | Number of base pairs in the PCR product of the primer pair. | Mandatory
|
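The EST ID field is marked "Mandatory Unique". One possible way to check this is sketched below, assuming each row of the ESTs sheet has been read into a dictionary keyed by field name; the helper name `duplicate_ids` and the row representation are illustrative and not part of the template.

```python
from collections import Counter


def duplicate_ids(rows, field="EST ID"):
    """Return the values of `field` that occur in more than one row."""
    counts = Counter(row[field] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical rows; only the EST ID column is shown.
rows = [{"EST ID": "EST001"}, {"EST ID": "EST002"}, {"EST ID": "EST001"}]
print(duplicate_ids(rows))
```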
Sequence Data
Section Description: Information regarding the nucleotide sequences of the ESTs.
| Field Name | Description | Conditions |
| Sequence ID | Unique identifier of the sequence. | Mandatory
|
| GeneBank Accession Number | Accession number of the sequence in the database.
Example: AB123456 | Mandatory
|
| SSR ID | Unique identifier of the SSR. | Mandatory
|
| Motif Length | Number of base pairs in the motif. | Mandatory
|
| Motif | Repeated motif in the SSR. | Mandatory
|
| Number Of Repeats | Number of times motif is repeated. | Mandatory
|
| SSR Start | Start site of SSR in the sequence. | Mandatory
|
| SSR Stop | Stop site of SSR in the sequence. | Mandatory
|
| Sequence Length | Number of base pairs in the sequence. | Mandatory
|
| Putative Annotation | Putative annotation for the sequence. | Optional
|
| Sequence | Actual sequence. | Mandatory
|
| Remarks | Any additional information supporting the data that the authors / curators want to add. Multiple references should be separated with a semi-colon. | Optional
|
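The SSR fields above are interrelated: Motif Length should match Motif, Sequence Length should match Sequence, and the span from SSR Start to SSR Stop should contain the motif repeated Number Of Repeats times. The following sketch checks one row for these relationships. It assumes 1-based, inclusive coordinates and a row held as a dictionary keyed by field name; the template specifies neither, and `check_ssr_row` is a hypothetical helper.

```python
def check_ssr_row(row):
    """Return a list of inconsistencies found in one Sequence Data row."""
    problems = []
    motif = row["Motif"].upper()
    sequence = row["Sequence"].upper()
    repeats = int(row["Number Of Repeats"])
    start, stop = int(row["SSR Start"]), int(row["SSR Stop"])

    if int(row["Motif Length"]) != len(motif):
        problems.append("Motif Length does not match Motif")
    if int(row["Sequence Length"]) != len(sequence):
        problems.append("Sequence Length does not match Sequence")
    # Assumption: SSR Start and SSR Stop are 1-based and inclusive.
    if sequence[start - 1:stop] != motif * repeats:
        problems.append("SSR Start/Stop do not span Motif x Number Of Repeats")
    return problems


# Made-up row: the motif GA repeated 4 times at positions 5 to 12.
row = {
    "Motif": "GA",
    "Motif Length": 2,
    "Number Of Repeats": 4,
    "SSR Start": 5,
    "SSR Stop": 12,
    "Sequence Length": 15,
    "Sequence": "ACGTGAGAGAGATTC",
}
print(check_ssr_row(row))
```

An empty list means the row passed all three checks under the stated coordinate assumption.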
Samples
Section Description: Information about samples used in the experiment
Samples (optional)
The first field in the Samples section is the SampleID, which relates directly to the SampleID field in the data spreadsheet or file. This SampleID is a unique identifier of a DNA sample, which can be a sample in a well on a gel or a LIMS entry. It could even be a unique identifier developed specifically for this dataset. In the case of multiple extractions from the same material, each sample would have a unique SampleID. Please refer to the section on Multiple Data Points for more details.
The GermplasmID field is an optional field for collections where a new GermplasmID is assigned each time an accession is regenerated or a new seed or germplasm sample is taken for some other reason. An accession in this case is therefore a collection of samples with different GermplasmIDs. GermplasmIDs are often unique only within a specific database; for this reason they should be prefixed by the database name or abbreviation. For example, an entry with GermplasmID 2341 in IWIS would be IWIS:2341.
The remaining accession data should be in either multi-crop passport descriptors (MCPD) or EURISCO descriptors format. MCPD defines a total of 28 descriptors for passport data, each of which equates to a column in the template. EURISCO defines an additional 6 descriptors for a total of 33 descriptors. Only a few MCPD or EURISCO descriptors are mandatory, and for the sake of brevity only the mandatory and some recommended optional fields are described here. However, the mandatory descriptors provide sufficient information to allow the accession to be found in the appropriate National Inventory or genebank. For a full description of all MCPD and EURISCO descriptors, please refer to the EURISCO_Descriptors.doc file, which is available from the EPGRIS website (http://www.ecpgr.cgiar.org/epgris/) or can be downloaded with the passport template.
see section: generalPassportData in GCPPassportTemplate2.0
for the following fields
sampleID, sampleGermplasmID, localUniqueID, holdingInstitute, collectionName, genus, species, countryOfOrigin
| Field Name | Description | Conditions |
| Sample ID | A unique identifier of a DNA sample, which can be a sample in a well on a gel or a LIMS entry, or even a unique ID created specifically for this dataset. The SampleID is specific to a lab and is not a universal identifier. If the accession data is provided, it must relate to the SampleID in the accession sheet or file. | Mandatory Unique
|
| Germplasm ID | An alphanumeric value which uniquely identifies the germplasm. The proposed format is the concatenation HoldingInstitute:CollectionName:LocalUniqueID. If a new Germplasm ID is assigned each time an accession is regenerated or for some reason sub-sampled, use the current germplasm ID prefixed with the system or database name.
Example: NGA333:Genebank:252
Example: COL003:CIATBEAN:3542
Example: MEX064:IWIS:2341 | Mandatory Unique
|
| Country of Origin | Code of the country in which the sample was originally collected. Use 3-letter ISO 3166-1 extended country codes. | Optional
|
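The proposed Germplasm ID format, HoldingInstitute:CollectionName:LocalUniqueID, can be illustrated with a short sketch. The function names are hypothetical, and the rule that no component may itself contain a colon is an assumption made so that the ID can be split again unambiguously.

```python
def make_germplasm_id(holding_institute, collection_name, local_unique_id):
    """Join the three components with colons, as in MEX064:IWIS:2341."""
    parts = [
        str(part).strip()
        for part in (holding_institute, collection_name, local_unique_id)
    ]
    # Assumption: components are non-empty and contain no colon.
    if any(not part or ":" in part for part in parts):
        raise ValueError("components must be non-empty and contain no ':'")
    return ":".join(parts)


def parse_germplasm_id(germplasm_id):
    """Split a Germplasm ID back into its three components."""
    parts = germplasm_id.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected HoldingInstitute:CollectionName:LocalUniqueID")
    return dict(zip(("holdingInstitute", "collectionName", "localUniqueID"), parts))


# Examples taken from the Germplasm ID field above.
print(make_germplasm_id("MEX064", "IWIS", 2341))
print(parse_germplasm_id("COL003:CIATBEAN:3542"))
```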
Institutions
Section Description: List of institute codes used in passport data sections and their corresponding decoded names and addresses.
see section: institutions in GCPDataSubmissionTemplate2.0
for the following fields
faoInstituteCode, organizationName, street, cityState, zipCode, country, institutionalEmail, institutionalTelephone, fax, url, primaryContactName
Copyright (c) 2004-2006 Bioversity, CIMMYT, IITA, IRRI
Developed by Guy Davenport (CIMMYT), Sarah Hearne (IITA), Mathieu Rouard (Bioversity), Genevieve Aquino (IRRI)
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.
|