
Analysis of codon usage bias in pineapple  Cited by: 23

Analysis of codon usage bias of Ananas comosus with genome sequencing data
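Abstract 【Objective】To analyze codon usage bias in the coding sequences (CDS) of the pineapple [Ananas comosus (L.) Merr.] genome, in order to provide a theoretical basis for understanding the rules of codon usage bias in pineapple and for molecular modification, and to promote research on plant codon biology. 【Methods】Using the 30 663 CDS obtained from pineapple genome sequencing as the data source, codon usage bias, codon pairs and multivariate statistics were analyzed with custom perl scripts, CUSP and SPSS. 【Results】The mean GC content of the CDS in the pineapple genome data was 52.09%, the mean GC content at the third codon position (GC3S) was 55.41%, and the effective number of codons (ENC) was 58.41, with the great majority of ENC values above 35. In addition, 34 high-frequency codons (RSCU greater than 1) were identified, of which only 8 ended in A or T and 25 ended in C or G; 31 high-expression superior codons were also identified. Combining these results, 13 optimal codons were finally selected. Comparison of GC3S and codon usage frequencies with 17 plant species showed that dicotyledons and monocotyledons differ considerably in GC3S and codon usage frequency, and that pineapple is closer to the dicotyledons than the other monocotyledons are. 【Conclusion】Codon usage bias in pineapple was analyzed at three levels (among different genes, at different positions within genes, and among different plant species), and 13 optimal codons of pineapple were selected. This study helps to better understand the rules of codon usage bias in pineapple and promotes research on plant codon biology and the potential application of genome data in non-model plants.

【Objective】Codon usage bias refers to differences in the frequency of occurrence of synonymous codons in coding DNA. A codon is a series of three nucleotides (a triplet) that encodes a specific amino acid residue in a polypeptide chain or signals the termination of translation (stop codons). Over a long period of evolution, each species has formed its own codon usage pattern. Pineapple [Ananas comosus (L.) Merr.] is a nutrient-dense fruit with strong consumer demand and high commercial value. However, little is known about the rules of pineapple codon usage. The aim of the present study was to investigate codon usage patterns in pineapple genome sequencing data in order to provide guidance for genetic transformation, new gene discovery, regulation of functional gene expression, prediction of protein structure and function, comparative genomics with other species and molecular breeding in pineapple. 【Methods】Data were obtained from the JGI database. The 30 663 genes in the pineapple genome sequencing data were analyzed with perl scripts and SPSS software, and GC content, effective number of codons (ENC), relative synonymous codon usage (RSCU) and codon pair usage were calculated. The RSCU value is the observed frequency of a codon relative to the frequency expected if all synonymous codons for the same amino acid were used equally.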
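The RSCU measure described in the Methods can be illustrated with a short Python sketch. This is not the authors' perl script: the function names `count_codons` and `rscu` are hypothetical, the standard nuclear genetic code is assumed, and single-codon amino acids (Met, Trp) are kept and therefore always score 1.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(cds_list):
    """Pool in-frame codon counts over a list of CDS strings (hypothetical helper)."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper()
        # Ignore a trailing partial codon and any triplet with ambiguous bases.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    return counts


def rscu(counts):
    """RSCU = observed count / (family total / number of synonymous codons)."""
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families.setdefault(aa, []).append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts.get(c, 0) for c in codons)
        for c in codons:
            values[c] = counts.get(c, 0) * len(codons) / total if total else 0.0
    return values


# Toy input (illustrative only, not pineapple data): Met, Leu, Leu, Leu, stop.
toy = rscu(count_codons(["ATGCTTCTTCTCTAA"]))
print(toy["CTT"], toy["CTC"])
```

In the toy sequence leucine is encoded twice by CTT and once by CTC, so with six synonymous codons the expected count per codon is 0.5, which is what the two printed values are scaled against.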
In the absence of codon usage preference, the RSCU of each synonymous codon is 1. When the RSCU of a codon was over 1, the codon was defined as a high-frequency codon, indicating that it was used more often than its synonymous alternatives and that the genes had a preference for it. The ENC value describes the degree to which codon usage deviates from random selection and reflects the degree of preference for synonymous codon usage within codon families. The smaller the ENC value, the higher the expression level of the corresponding endogenous gene was taken to be. Genes were ranked by their ENC values, and RSCU values were obtained for the high- and low-expression gene groups. If the RSCU difference between the high- and low-expression genes was over 0.08, the corresponding codon was determined to be a high-expression superior codon. If a codon was both a high-frequency codon and a high-expression superior codon, it was an optimal codon. The pineapple genes were imported into CUSP software to obtain the codon usage frequencies. The genome data of Carica papaya, Glycine max, Arabidopsis thaliana, Ricinus communis, Prunus mume, Prunus persica, Cucumis sativus, Cajanus cajan, Oryza sativa, Brassica rapa, Citrus sinensis, Brachypodium distachyon, Populus trichocarpa, Theobroma cacao, Vitis vinifera, Sorghum bicolor and Zea mays were retrieved from the JGI database. The codon usage frequencies of pineapple genes were compared with those of the other species.
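The selection rule for optimal codons described above (RSCU over 1, and an RSCU difference over 0.08 between high- and low-expression genes) can be sketched in Python as follows. Several points are assumptions rather than details from the abstract: per-gene ENC values are taken as precomputed inputs, the high- and low-expression groups are the 10% of genes with the lowest and highest ENC (the abstract does not state the group size), and the names `pooled_rscu` and `optimal_codons` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def pooled_rscu(cds_list):
    """RSCU computed from codon counts pooled over a group of CDS strings."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            counts[seq[i:i + 3]] += 1
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)
    out = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            out[c] = counts[c] * len(codons) / total if total else 0.0
    return out


def optimal_codons(genes, fraction=0.1, delta=0.08):
    """genes: list of (cds, enc) pairs; fraction is an assumed group size."""
    ranked = sorted(genes, key=lambda g: g[1])  # ascending ENC
    n = max(1, int(len(ranked) * fraction))
    # Low ENC is treated as high expression, following the abstract's premise.
    high = pooled_rscu([cds for cds, _ in ranked[:n]])
    low = pooled_rscu([cds for cds, _ in ranked[-n:]])
    overall = pooled_rscu([cds for cds, _ in genes])
    high_frequency = {c for c, v in overall.items() if v > 1.0}
    high_expression = {c for c in high if high[c] - low[c] > delta}
    return sorted(high_frequency & high_expression)


# Toy genes with made-up ENC values (illustrative only, not pineapple data).
toy_genes = [("CTTCTTCTTCTC", 30.0), ("CTCCTCCTTCTG", 60.0)]
print(optimal_codons(toy_genes))
```

Taking ENC as an input keeps the sketch short; computing ENC itself would require a separate implementation that the abstract does not describe.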
If the ratio of a codon's usage frequencies between two species was in the range of 0.5-2.0, the codon preferences of the two species were considered relatively close. 【Results】The GC content of pineapple genes was 52.09% and the GC content at the third codon position (GC3S, the GC content of the third nucleotide of synonymous codons) was 55.41%, indicating no obvious bias at the third position. The ENC of all genes was 58.41, and the majority of ENC values were over 35, indicating that the codon usage bias (CUB) of pineapple genes was weak. In addition, 34 codons had an RSCU over 1 and were defined as high-frequency codons (CTC, TTG, CTT, AGG, CGC, AGA, CGG, TCC, TCT, AGC, TCG, GTG, GTT, GTC, GGC, GGG, ACC, ACT, CCT, CCG, CCC, ATC, ATT, GCC, GCG, TGC, AAG, GAG, TTC, GAT, TAC, CAG, CAC, AAT); only 8 of them ended with A or T and 25 ended with G or C, which indicated that pineapple codons preferentially end in C or G. At the same time, 31 high-expression superior codons were obtained, and 13 optimal codons were identified on this basis: AGG, AGA, TCT, CTT, TTG, GTT, CCT, ACT, ATT, GAT, AAT, TTT and TAT. In addition, codon pair usage was analyzed for the 20 amino acids. Comparison with 17 other species showed that the codon usage patterns of monocotyledonous plant genes differed greatly from those of dicotyledonous plant genes, and that pineapple was closer to the dicotyledonous plants than the other monocotyledons were. 【Conclusion】Thirteen optimal codons were selected through the analysis of codon usage bias in Ananas comosus, which provides a basis for gene optimization and for the prediction of genes of unknown function in pineapple.
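The between-species comparison with the 0.5-2.0 criterion can be sketched as below. The sketch reads the criterion as a per-codon ratio of usage frequencies, which is an interpretation rather than something the abstract spells out; the function name `divergent_codons` is hypothetical, and the input dictionaries are assumed to hold per-thousand codon frequencies.

```python
def divergent_codons(freq_a, freq_b, low=0.5, high=2.0):
    """Return codons whose frequency ratio a/b falls outside [low, high].

    freq_a, freq_b: dicts mapping codon -> usage frequency (assumed per thousand).
    Only codons present in both tables are compared.
    """
    divergent = {}
    for codon in sorted(set(freq_a) & set(freq_b)):
        a, b = freq_a[codon], freq_b[codon]
        if b == 0:
            # A codon unused in both species is treated as no difference.
            ratio = float("inf") if a > 0 else 1.0
        else:
            ratio = a / b
        if not (low <= ratio <= high):
            divergent[codon] = ratio
    return divergent


# Made-up frequencies for three alanine codons (illustrative only).
species_a = {"GCC": 30.0, "GCT": 10.0, "GCA": 5.0}
species_b = {"GCC": 10.0, "GCT": 12.0, "GCA": 20.0}
print(divergent_codons(species_a, species_b))
```

Under this reading, the fewer codons a pair of species has outside the range, the closer their codon preferences are judged to be.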
Source: Journal of Fruit Science (《果树学报》), CAS, CSCD, PKU Core; 2017, No. 8, pp. 946-955 (10 pages)
Funding: National Natural Science Foundation of China (31260460); Key Research and Development Project of Hainan Province (ZDYF2016035)
Keywords: pineapple (Ananas comosus); genome sequencing data; codon usage bias; GC; ENC; RSCU; optimal codons
