Large-scale sequencing of Expressed Sequence Tags (ESTs) in Arborea aims to discover gene sequences representing diverse biological processes, provide cDNA clones for custom microarray manufacture, accelerate functional investigations of candidate genes, and support the development of gene-based genotyping assays. Collaborations with other programs are an essential component of our program: they extend our sampling of the transcriptome, enable comparative studies, and bring greater value to the scientific community.
Phase 1 – EST sequencing and databasing in spruce and poplar (2002-2006)
In white spruce, we analyzed 70,000 ESTs through 3’ and 5’ sequencing of random cDNAs from 17 libraries. A unigene set has been resequenced from the 5’ end. Analysis, annotation and databasing are described in Pavy et al. 2005 in BMC Genomics. For poplar, 5’ sequencing was carried out on 14,000 random cDNAs from 7 libraries. High-throughput sequencing was carried out by the Genome Sciences Center (Vancouver) and by the Genome Québec – McGill Innovation Center (Montreal).
Access to the sequence information and mining of the data are supported by the databases SpruceDB and ForestTreeDB, and the BioData webpages, all developed in partnership with the Center for Computational Biology and Genomics (Univ. of Minnesota).
LINK TO: Detailed description of each cDNA library: tissue and treatments, library construction methods, number and direction of reads.
Phase 2 – Toward a complete catalogue of expressed spruce genes (2006-2010)
We aim to sequence 100,000 new cDNA clones from white spruce (Picea glauca), combining 5’ and 3’ reads (150,000 reads total). The cDNAs will come from 15-20 cDNA libraries constructed from specialized tissue samples relevant to growth and wood formation that are currently not represented or are underrepresented among the spruce EST libraries. In order to move efficiently, the libraries will be made from very targeted and diverse tissue samples. Full-length sequencing of a large number of cDNAs is also planned.