Abstract
The number of completely sequenced archaeal genomes has become sufficient for a large-scale bioinformatic study. We have analyzed each coding region from 36 archaeal genomes using the original CGS algorithm, calculating the total GC content (G+C); the GC content in first, second and third codon positions, as well as in fourfold and twofold degenerate sites from third codon positions; levels of arginine codon usage (Arg2: AGA/G; Arg4: CGX); levels of amino acid usage; and the entropy of the amino acid content distribution. In archaeal genomes with strong GC pressure, arginine is coded preferentially by GC-rich Arg4 codons, whereas in most archaeal genomes with G+C < 0.6, arginine is coded preferentially by AT-rich Arg2 codons. In the genome of Haloquadratum walsbyi, which is closely related to GC-rich archaea, GC content has decreased mostly in third codon positions, while the Arg4 >> Arg2 bias still persists. Proteomes of archaeal species carry characteristic amino acid biases: levels of isoleucine and lysine are elevated, while levels of alanine, histidine, glutamine and cysteine are relatively decreased. Numerous observed genomic and proteomic biases can be explained by the hypothesis of a previously existing strong mutational AT pressure in the common predecessor of all archaea.
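The per-gene quantities named in the abstract can be illustrated with a minimal Python sketch. This is not the CGS algorithm itself, whose internals are not described here; the function names (`gc_by_position`, `arginine_usage`, `shannon_entropy`) are hypothetical, the entropy is assumed to be Shannon entropy in bits, and GC content in fourfold and twofold degenerate sites is left out for brevity.

```python
from collections import Counter
from math import log2

# Arginine codon families as defined in the abstract:
# Arg2 = AGA/AGG (AT-rich), Arg4 = CGX (GC-rich).
ARG2 = {"AGA", "AGG"}
ARG4 = {"CGA", "CGC", "CGG", "CGT"}


def codons(cds):
    """Split a coding sequence into complete codons (trailing bases are dropped)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc_fraction(bases):
    """Fraction of G or C among the given single-letter bases."""
    bases = list(bases)
    if not bases:
        return 0.0
    return sum(b in ("G", "C") for b in bases) / len(bases)


def gc_by_position(cds):
    """Total GC content and GC content at each of the three codon positions."""
    cs = codons(cds)
    return {
        "total": gc_fraction("".join(cs)),
        "GC1": gc_fraction(c[0] for c in cs),
        "GC2": gc_fraction(c[1] for c in cs),
        "GC3": gc_fraction(c[2] for c in cs),
    }


def arginine_usage(cds):
    """Return (Arg2 count, Arg4 count) for one coding sequence."""
    counts = Counter(codons(cds))
    arg2 = sum(counts[c] for c in ARG2)
    arg4 = sum(counts[c] for c in ARG4)
    return arg2, arg4


def shannon_entropy(protein):
    """Shannon entropy (bits) of the amino acid frequency distribution."""
    n = len(protein)
    if n == 0:
        return 0.0
    counts = Counter(protein)
    return -sum((k / n) * log2(k / n) for k in counts.values())
```

For a toy sequence such as `"ATGCGTAGACGCTAA"`, `arginine_usage` counts one Arg2 codon (AGA) and two Arg4 codons (CGT, CGC); comparing such counts across all genes of a genome is one way to look at the Arg4 versus Arg2 preference that the abstract discusses.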
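Publication date
January 11, 2010 (date first posted online on the China Journal Net platform; this does not represent the paper's publication date)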