
Coding Sequences Avoid the Use of "A" in Degenerate Codon Positions

William Seffens

Department of Biological Sciences and Center for Theoretical Study of Physical Systems, Clark Atlanta University, Atlanta, GA 30314

The nucleotide base composition of 51 genes was examined within coding sequences (CDS) and compared with that of non-coding open reading frames (ORFs). Base composition was counted separately for fixed (usually the first two nucleotides in a codon) and degenerate (usually the third) codon positions. Within the CDS of the 51 genes, the proportion of "T" is the same in fixed and degenerate positions, whereas "A", "C", and "G" each show a bias. "A" and "G" are less frequent in degenerate positions than in fixed positions, while "C" is enriched in degenerate positions. The avoidance of "A" and the enrichment of "C" are quite strong. These biases are not found in ORFs from sequences with the same overall base composition. This suggests a means of assisting the identification of genes among ORFs.
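
The counting step described above can be illustrated with a minimal Python sketch. It is not the author's procedure: it assumes the simplification that the first two positions of every codon are fixed and the third is degenerate (the abstract says only "usually"), and the function names `position_composition` and `_freqs` are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def _freqs(counts):
    """Convert raw base counts into fractions over A, C, G, and T only."""
    total = sum(counts[b] for b in BASES)
    if total == 0:
        return {b: 0.0 for b in BASES}
    return {b: counts[b] / total for b in BASES}


def position_composition(seq):
    """Return (fixed, degenerate) base fractions for an in-frame sequence.

    Assumption: codon positions 1 and 2 are treated as fixed and position 3
    as degenerate for every codon, which is a simplification.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    fixed = Counter()
    degenerate = Counter()
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        fixed.update(codon[:2])      # first two codon positions
        degenerate.update(codon[2])  # third codon position
    return _freqs(fixed), _freqs(degenerate)


if __name__ == "__main__":
    # Toy sequence for illustration only, not data from the study.
    fixed, degenerate = position_composition("ATGGCCAAC")
    for base in BASES:
        print(base, round(fixed[base], 3), round(degenerate[base], 3))
```

For the toy sequence, "A" makes up half of the fixed positions and none of the degenerate ones. Applying the same function to a CDS and to a non-coding ORF of similar overall composition would allow the kind of comparison the abstract describes.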
