SeqBench

Codon Usage Tables — E. coli, Human and Yeast

The genetic code is redundant: most amino acids are encoded by several synonymous codons, and organisms use those alternatives unevenly. This codon usage table shows, for E. coli, human and yeast, the share of each amino acid's codons accounted for by each synonymous codon. The most-used codon for each amino acid, the preferred codon, is highlighted for every organism.

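| Amino acid | 1-letter | 3-letter | Codon | E. coli | Human | Yeast |
|---|---|---|---|---|---|---|
| Alanine | A | Ala | GCT | 16% | 27% | 38% |
| Alanine | A | Ala | GCC | 27% | 40% | 22% |
| Alanine | A | Ala | GCA | 21% | 23% | 29% |
| Alanine | A | Ala | GCG | 36% | 11% | 11% |
| Arginine | R | Arg | CGT | 38% | 8% | 14% |
| Arginine | R | Arg | CGC | 40% | 19% | 6% |
| Arginine | R | Arg | CGA | 6% | 11% | 7% |
| Arginine | R | Arg | CGG | 10% | 21% | 4% |
| Arginine | R | Arg | AGA | 4% | 20% | 48% |
| Arginine | R | Arg | AGG | 2% | 21% | 21% |
| Asparagine | N | Asn | AAT | 45% | 46% | 59% |
| Asparagine | N | Asn | AAC | 55% | 54% | 41% |
| Aspartic acid | D | Asp | GAT | 63% | 46% | 65% |
| Aspartic acid | D | Asp | GAC | 37% | 54% | 35% |
| Cysteine | C | Cys | TGT | 45% | 45% | 63% |
| Cysteine | C | Cys | TGC | 55% | 55% | 37% |
| Glutamic acid | E | Glu | GAA | 69% | 42% | 70% |
| Glutamic acid | E | Glu | GAG | 31% | 58% | 30% |
| Glutamine | Q | Gln | CAA | 34% | 25% | 69% |
| Glutamine | Q | Gln | CAG | 66% | 75% | 31% |
| Glycine | G | Gly | GGT | 34% | 16% | 47% |
| Glycine | G | Gly | GGC | 40% | 34% | 19% |
| Glycine | G | Gly | GGA | 11% | 25% | 22% |
| Glycine | G | Gly | GGG | 15% | 25% | 12% |
| Histidine | H | His | CAT | 57% | 41% | 64% |
| Histidine | H | His | CAC | 43% | 59% | 36% |
| Isoleucine | I | Ile | ATT | 51% | 36% | 46% |
| Isoleucine | I | Ile | ATC | 42% | 48% | 26% |
| Isoleucine | I | Ile | ATA | 7% | 16% | 27% |
| Leucine | L | Leu | TTA | 13% | 7% | 28% |
| Leucine | L | Leu | TTG | 13% | 13% | 29% |
| Leucine | L | Leu | CTT | 10% | 13% | 13% |
| Leucine | L | Leu | CTC | 10% | 20% | 6% |
| Leucine | L | Leu | CTA | 4% | 7% | 14% |
| Leucine | L | Leu | CTG | 50% | 40% | 11% |
| Lysine | K | Lys | AAA | 76% | 42% | 58% |
| Lysine | K | Lys | AAG | 24% | 58% | 42% |
| Methionine | M | Met | ATG | 100% | 100% | 100% |
| Phenylalanine | F | Phe | TTT | 57% | 45% | 59% |
| Phenylalanine | F | Phe | TTC | 43% | 55% | 41% |
| Proline | P | Pro | CCT | 16% | 28% | 31% |
| Proline | P | Pro | CCC | 12% | 33% | 15% |
| Proline | P | Pro | CCA | 19% | 27% | 42% |
| Proline | P | Pro | CCG | 53% | 11% | 12% |
| Serine | S | Ser | TCT | 15% | 18% | 26% |
| Serine | S | Ser | TCC | 15% | 22% | 16% |
| Serine | S | Ser | TCA | 12% | 15% | 21% |
| Serine | S | Ser | TCG | 15% | 6% | 10% |
| Serine | S | Ser | AGT | 15% | 15% | 16% |
| Serine | S | Ser | AGC | 28% | 24% | 11% |
| Threonine | T | Thr | ACT | 17% | 24% | 35% |
| Threonine | T | Thr | ACC | 44% | 36% | 22% |
| Threonine | T | Thr | ACA | 13% | 28% | 30% |
| Threonine | T | Thr | ACG | 27% | 12% | 13% |
| Tryptophan | W | Trp | TGG | 100% | 100% | 100% |
| Tyrosine | Y | Tyr | TAT | 57% | 43% | 56% |
| Tyrosine | Y | Tyr | TAC | 43% | 57% | 44% |
| Valine | V | Val | GTT | 26% | 18% | 39% |
| Valine | V | Val | GTC | 22% | 24% | 21% |
| Valine | V | Val | GTA | 15% | 11% | 21% |
| Valine | V | Val | GTG | 37% | 47% | 19% |
| Stop | * | Stop | TAA | 61% | 28% | 47% |
| Stop | * | Stop | TAG | 9% | 20% | 23% |
| Stop | * | Stop | TGA | 30% | 52% | 30% |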

Green = preferred codon (highest fraction) for that organism · Values are fractions among synonymous codons · Each amino acid sums to ~100%
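The "sums to ~100%" property is easy to check by hand or in code. The sketch below uses the human values for alanine and glycine copied from the table; the function names and the 2-point tolerance are assumptions chosen for illustration, not part of any published tool. Because the table values are rounded to whole percentages, a group can land on 99 or 101 rather than exactly 100.

```python
# Human-column percentages copied from the table for two amino acids.
HUMAN = {
    "Ala": {"GCT": 27, "GCC": 40, "GCA": 23, "GCG": 11},
    "Gly": {"GGT": 16, "GGC": 34, "GGA": 25, "GGG": 25},
}

def group_sums(table):
    """Sum the percentages of all synonymous codons for each amino acid."""
    return {aa: sum(codons.values()) for aa, codons in table.items()}

def out_of_tolerance(table, tolerance=2):
    """List amino acids whose total differs from 100 by more than `tolerance`.

    The tolerance is an assumed allowance for rounding, not a standard value.
    """
    return [aa for aa, total in group_sums(table).items()
            if abs(total - 100) > tolerance]

sums = group_sums(HUMAN)          # alanine totals 101 because of rounding
suspect = out_of_tolerance(HUMAN)  # empty list: both groups are within tolerance
```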

What codon usage means

Because the genetic code is degenerate, leucine has six codons, isoleucine three, and only methionine and tryptophan have one each. The numbers above are not raw counts: each value is the fraction of that codon among the synonymous codons for the same amino acid, so every amino-acid group sums to roughly 1.0 (100%). This relative measure is closely related to RSCU (relative synonymous codon usage), which rescales the same fraction by the number of synonymous codons so that 1.0 means no bias. It is what reveals codon bias: the systematic preference an organism shows for some synonymous codons over others, driven by its tRNA pool, genome composition and selection on highly expressed genes.
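As a rough illustration of how such fractions relate to raw counts and to RSCU, here is a minimal sketch. The counts are hypothetical numbers chosen so that the isoleucine fractions match the E. coli column of the table; they are not real genome counts. RSCU is computed as the observed count divided by the count expected if all synonymous codons were used equally, which is the same as the fraction multiplied by the number of synonymous codons.

```python
# Hypothetical raw codon counts (illustrative only, not measured data).
COUNTS = {
    "Ile": {"ATT": 510, "ATC": 420, "ATA": 70},
    "Lys": {"AAA": 760, "AAG": 240},
}

def fractions(counts):
    """Fraction of each codon among the synonymous codons of one amino acid."""
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

def rscu(counts):
    """RSCU: observed count divided by the count expected under equal usage."""
    expected = sum(counts.values()) / len(counts)
    return {codon: n / expected for codon, n in counts.items()}

ile_frac = fractions(COUNTS["Ile"])  # ATT 0.51, ATC 0.42, ATA 0.07
ile_rscu = rscu(COUNTS["Ile"])       # ATT about 1.53, i.e. 0.51 * 3
```

An RSCU above 1.0 marks a codon used more often than equal usage would predict, and a value below 1.0 marks one used less often.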

Preferred and rare codons

For each amino acid and organism, the codon with the highest fraction is the preferred (optimal) codon, shown in green above. It is the codon a host's most highly expressed genes tend to use, and the one a codon optimiser selects when rewriting a gene. At the other end, codons with a low fraction are rare codons: their matching tRNAs are often scarce, so clusters of them can slow or stall translation, reduce protein yield and occasionally promote misfolding. When a gene is moved into a new host, swapping rare codons for the host's preferred synonyms, without changing the encoded protein, is the core of codon optimisation.
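The simplest form of this idea, choosing the highest-fraction codon for every residue, can be sketched as follows. The fractions are copied from the table for three amino acids only; `preferred_codon` and `back_translate` are hypothetical helper names. This is a deliberately naive sketch: practical optimisers typically weigh further constraints (GC content, repeats, restriction sites) that are not modelled here.

```python
# Fractions copied from the table (percent converted to fraction), three amino acids only.
USAGE = {
    "E. coli": {
        "M": {"ATG": 1.00},
        "K": {"AAA": 0.76, "AAG": 0.24},
        "L": {"TTA": 0.13, "TTG": 0.13, "CTT": 0.10,
              "CTC": 0.10, "CTA": 0.04, "CTG": 0.50},
    },
    "Human": {
        "M": {"ATG": 1.00},
        "K": {"AAA": 0.42, "AAG": 0.58},
        "L": {"TTA": 0.07, "TTG": 0.13, "CTT": 0.13,
              "CTC": 0.20, "CTA": 0.07, "CTG": 0.40},
    },
}

def preferred_codon(host, amino_acid):
    """Return the synonymous codon with the highest fraction in `host`."""
    table = USAGE[host][amino_acid]
    return max(table, key=table.get)

def back_translate(protein, host):
    """Encode a protein (1-letter codes) using only the host's preferred codons."""
    return "".join(preferred_codon(host, aa) for aa in protein)

ecoli_dna = back_translate("MKL", "E. coli")  # ATG + AAA + CTG
human_dna = back_translate("MKL", "Human")    # ATG + AAG + CTG
```

Both sequences encode the same peptide; only the lysine codon differs, because E. coli favours AAA and human favours AAG.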

About these values

These fractions are reference approximations derived from the Kazusa Codon Usage Database for each organism and are intended for guidance. Published codon tables differ between sources, releases and the exact gene set they are computed from, so for critical work — synthesising a gene, troubleshooting low expression — you should verify the values against your specific expression system rather than treat any single table as definitive.

Frequently asked questions

What is a codon usage table?
A codon usage table lists, for each amino acid, the synonymous codons that encode it and how often each one is actually used in a given organism's genes. Because the genetic code is redundant — most amino acids have two to six codons — organisms use those alternatives unevenly, and the table captures that preference.
What does the fraction or percentage mean?
Each value is the fraction of that codon among the synonymous codons for the same amino acid, so the codons for any one amino acid sum to about 1.0 (100%). For example, leucine's CTG shows 50% in E. coli, which means half of all leucine codons in E. coli genes are CTG. The number is a relative share within an amino acid, not the codon's frequency across the whole genome.
What is the preferred (optimal) codon?
The preferred codon is the synonymous codon with the highest usage fraction for an amino acid in that organism — the cells highlighted in green in the table. It is the codon a highly expressed gene is most likely to use, and the one a codon optimiser will pick when rewriting a gene for that host.
What are rare codons and why avoid them?
Rare codons are synonymous codons with a low usage fraction in the target organism, often because the matching tRNA is scarce. A run of rare codons can slow or stall the ribosome, lower protein yield and sometimes cause misfolding — so they are usually avoided when expressing a gene in a heterologous host.
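A minimal sketch of how a run of rare codons might be spotted in a coding sequence follows. The E. coli fractions are copied from the table for a handful of codons; the 0.10 cutoff, the minimum run length of two and the function names are assumptions made for illustration, since there is no single agreed definition of "rare". Codons missing from the small lookup are treated as not rare.

```python
# E. coli fractions copied from the table for a few codons only.
ECOLI_FRACTION = {
    "AGA": 0.04, "AGG": 0.02, "CGA": 0.06, "CTA": 0.04, "ATA": 0.07,
    "CTG": 0.50, "AAA": 0.76, "GAA": 0.69, "ATG": 1.00,
}

RARE_THRESHOLD = 0.10  # assumed cutoff, not a standard value

def split_codons(seq):
    """Split an in-frame sequence into codons, ignoring any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def rare_runs(seq, fraction, threshold=RARE_THRESHOLD, min_run=2):
    """Return (first_codon_index, run_length) for each run of rare codons."""
    codon_list = split_codons(seq)
    runs, start = [], None
    for i, codon in enumerate(codon_list):
        is_rare = fraction.get(codon, 1.0) < threshold  # unknown codons count as common
        if is_rare and start is None:
            start = i
        elif not is_rare and start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(codon_list) - start >= min_run:
        runs.append((start, len(codon_list) - start))
    return runs

# ATG AAA AGA AGG CTG GAA ATA: AGA and AGG form a run, ATA is an isolated rare codon.
example = "ATGAAAAGAAGGCTGGAAATA"
found = rare_runs(example, ECOLI_FRACTION)
```

Here the two adjacent arginine codons are reported as one run starting at codon index 2, while the single ATA at the end is below the minimum run length and is not reported.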
Why do E. coli and human codon usage differ?
Codon bias is shaped by each organism's tRNA pool, genome GC content and evolutionary history, so different species favour different synonymous codons. A gene that is well optimised for human cells can contain codons that are rare in E. coli, which is why a sequence often needs codon optimisation before it is expressed in a new host.
