Analysis of microsatellite and single nucleotide polymorphism within transcriptomic database in Cymbidium ensifolium
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | English |
Published: | Zhejiang University Press, 2014-07-01 |
Series: | 浙江大学学报. 农业与生命科学版 (Journal of Zhejiang University, Agriculture and Life Sciences) |
Subjects: | |
Online Access: | https://www.academax.com/doi/10.3785/j.issn.1008-9209.2013.08.064 |
Summary: | Cymbidium ensifolium is a species of the genus Cymbidium with an elegant shape, attractive appearance and fragrant aroma, features that give it extremely high ornamental value. Owing to the lack of genomic resources for this species, the development and application of molecular markers remain limited. With the development of RNA-Seq technology, transcriptomic data are accumulating and have become a useful resource for developing markers at low cost and with high efficiency. Here, the transcriptome of C. ensifolium was subjected to RNA-Seq. Illumina sequencing was performed at Shanghai Majorbio Bio-pharm Biotechnology Co., Ltd. (Shanghai, China) according to the manufacturer's instructions (Illumina, San Diego, CA). High-quality reads were assembled de novo using Trinity with an optimized k-mer length of 25. The program Msatcommander was used to analyze the frequencies of microsatellites (also called simple sequence repeats, SSRs). The minimum numbers of repeats for SSR detection were six for di-SSRs and four for tri-, tetra-, penta- and hexa-SSRs. Single nucleotide polymorphisms (SNPs) were detected and filtered using SAMtools and VarScan. The open reading frames (ORFs) and untranslated regions (UTRs) within the isogenes were identified using Trinity software. Isogenes containing SSRs and SNPs were annotated on the basis of BLAST similarity searches. All high-quality reads were assembled into 101 423 isogenes totaling 139 385 689 residues. The isogenes averaged 1 374 bp, ranging from 351 bp to 17 260 bp, and 70 583 isogenes (69.60%) were about 600 bp. In total, 17 793 SSRs and 16 676 SNPs were identified within the transcriptomic database. The densities of SSRs and SNPs were 1.28 SSRs/10 kb and 1.20 SNPs/10 kb, respectively. Excluding mono-SSRs, tri-SSRs were the most abundant type, followed by di-SSRs; tri-SSRs and di-SSRs accounted for 21.98% and 20.46% of all SSRs, respectively. The locations of the SSRs were also estimated. Estimated locations were obtained for 7 936 SSRs, but the location of the remaining 6 586 SSRs could not be determined because they extended over both estimated coding and non-coding regions. We found that tri-SSRs and hexa-SSRs occurred more frequently in coding regions. In contrast, di-SSRs, tetra-SSRs and penta-SSRs were more likely to appear in UTRs than in coding regions. Among the SNPs, C/T was the most common base substitution, followed by A/G; these two substitutions accounted for 30.80% and 28.81% of all SNPs, respectively. A total of 13 768 isogenes contained SSRs and 7 519 contained SNPs. These isogenes were annotated against the Clusters of Orthologous Groups (COG), Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Many of them were annotated as genes associated with important biological functions. In total, 1 748 SSR-containing and 1 932 SNP-containing isogenes were assigned to 23 COG classifications; 4 994 SSR-containing and 4 819 SNP-containing isogenes were classified into 80 and 78 GO terms, respectively; and 2 107 SSR-containing and 2 188 SNP-containing isogenes were involved in 300 and 308 KEGG pathways, respectively. The numerous SSRs and SNPs identified in this study will contribute to marker development. The annotation of isogenes containing SSRs and SNPs will help in constructing genetic maps and in exploring the associations between these markers and traits of interest. Such maps will in turn accelerate research on the genomics and functional genomics of C. ensifolium. |
---|---|
ISSN: | 1008-9209 2097-5155 |
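
The repeat thresholds and marker densities reported in the summary can be illustrated with a short Python sketch. It is not the Msatcommander algorithm and not the authors' pipeline: it is a minimal regular-expression search for perfect repeats, assuming the minimum repeat counts stated in the summary (six for di-SSRs, four for tri- to hexa-SSRs). Mono-SSRs are left out because the summary gives no threshold for them. The names `MIN_REPEATS`, `find_ssrs` and `density_per_10kb` are hypothetical and introduced only for illustration.

```python
import re

# Minimum repeat counts as stated in the summary:
# six for di-SSRs, four for tri-, tetra-, penta- and hexa-SSRs.
# Mono-SSRs are omitted (no threshold is given in the summary).
MIN_REPEATS = {2: 6, 3: 4, 4: 4, 5: 4, 6: 4}


def _is_shorter_unit(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return any(
        n % d == 0 and motif == motif[:d] * (n // d)
        for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if _is_shorter_unit(motif):
                continue  # e.g. 'ATAT' repeats are counted as di-SSRs instead
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


def density_per_10kb(marker_count, total_bases):
    """Markers per 10 kb of assembled sequence."""
    return marker_count / total_bases * 10_000


if __name__ == "__main__":
    demo = "GGC" + "AT" * 6 + "CCG" + "AGC" * 4 + "TT"
    print(find_ssrs(demo))
    # Figures from the summary: 17 793 SSRs and 16 676 SNPs over 139 385 689 bases.
    print(round(density_per_10kb(17_793, 139_385_689), 2))
    print(round(density_per_10kb(16_676, 139_385_689), 2))
```

With the counts from the summary, the density function gives 1.28 SSRs/10 kb and 1.20 SNPs/10 kb, matching the reported values.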