Reverse Translate Protein to DNA (Back-Translation)
Back-translate a protein to DNA using most-frequent or degenerate IUPAC codons.
🔒 Local processing — pasted sequences are not uploaded
Reverse translate (back-translate) a protein sequence into a DNA coding sequence. Because the genetic code is degenerate, many DNA sequences encode the same protein — so pick one of two strategies: map each residue to the single most-frequent codon in E. coli, human or yeast, or collapse every synonymous codon into one degenerate IUPAC codon (e.g. Leu -> YTN) for designing degenerate oligos. You get the back-translated DNA to copy, a per-residue codon breakdown, and a hand-off to the Codon Optimizer for constraint-aware design.
Back-translation is ambiguous: many DNA sequences encode the same protein. This tool picks the most-frequent codon (or a degenerate consensus) per residue. For constraint-aware design that also considers GC windows, repeats and restriction sites, use the Codon Optimizer.
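The most-frequent strategy amounts to a per-residue table lookup. The following is a minimal Python sketch of that idea, not this tool's actual implementation. The function name `back_translate_most_frequent` and the table `ECOLI_EXCERPT` are hypothetical, and the table is deliberately a partial excerpt (Met and Trp have only one codon; CTG and AAA are the commonly cited preferred E. coli codons for Leu and Lys). A real table would cover all 20 residues for each organism.

```python
# Hypothetical, partial preferred-codon table for illustration only.
# A complete table would list one codon for every amino acid.
ECOLI_EXCERPT = {
    "M": "ATG",  # Met has a single codon
    "W": "TGG",  # Trp has a single codon
    "L": "CTG",  # commonly cited preferred Leu codon in E. coli
    "K": "AAA",  # commonly cited preferred Lys codon in E. coli
}


def back_translate_most_frequent(protein: str, preferred: dict[str, str]) -> str:
    """Map each one-letter residue to its preferred codon and join the codons."""
    codons = []
    for position, residue in enumerate(protein.upper(), start=1):
        if residue not in preferred:
            # Fail loudly rather than silently skipping unknown residues.
            raise ValueError(f"no codon for residue {residue!r} at position {position}")
        codons.append(preferred[residue])
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate_most_frequent("MKLW", ECOLI_EXCERPT))
```

Passing the table as an argument keeps the lookup independent of the organism, so switching from E. coli to human or yeast would only mean supplying a different dictionary.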
How to use the Reverse Translate tool
1. Paste a protein sequence in one-letter amino-acid codes (or load the example).
2. Choose a mode: most-frequent codon (with an organism) or degenerate IUPAC consensus.
3. Copy the back-translated DNA and review the per-residue codons, then refine in the Codon Optimizer if needed.
Frequently asked questions
- What is reverse translation (back-translation)?
- Reverse translation converts a protein sequence back into a DNA coding sequence. Because several codons can encode the same amino acid, the result is not unique — this tool resolves the ambiguity either by choosing the most-frequent codon in your organism or by emitting one degenerate IUPAC codon that covers all synonymous codons.
- What is the difference between the most-frequent and degenerate modes?
- Most-frequent picks the single codon used most often for each residue in the selected organism (E. coli, human or yeast), giving a concrete sequence to synthesise. Degenerate mode instead outputs one IUPAC-ambiguity codon per residue (for example Leu becomes YTN, Ser becomes WSN) that matches every synonymous codon — useful for designing degenerate primers or probes.
- How are the degenerate IUPAC codons built?
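- For each of the three codon positions, the tool takes the set of bases that appear across all synonymous codons for that amino acid and encodes it as a single IUPAC symbol (e.g. A+G becomes R, A+C+G+T becomes N). The three symbols form the degenerate codon. Because each position is pooled independently, the degenerate codon for an amino acid with six codons also matches some non-synonymous codons: YTN, for example, includes TTT and TTC, which encode Phe.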
- Should I use this to order a synthetic gene?
- Treat it as a starting point. Real gene design also considers GC-content windows, secondary structure, repeats and restriction sites — use the Codon Optimizer for constraint-aware optimisation before synthesis. Codon-usage tables here are reference approximations.
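The per-position construction described in the FAQ answer on degenerate IUPAC codons can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's own code: it assumes the standard genetic code, and the names `CODON_TABLE`, `IUPAC`, `degenerate_codon` and `back_translate_degenerate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide ambiguity codes, keyed by the set of bases each symbol covers.
IUPAC = {
    frozenset(bases): symbol
    for symbol, bases in {
        "A": "A", "C": "C", "G": "G", "T": "T",
        "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
        "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    }.items()
}


def degenerate_codon(residue: str) -> str:
    """Pool the bases seen at each codon position and encode each pool as one symbol."""
    codons = [codon for codon, aa in CODON_TABLE.items() if aa == residue]
    if not codons:
        raise ValueError(f"unknown residue {residue!r}")
    return "".join(
        IUPAC[frozenset(codon[i] for codon in codons)] for i in range(3)
    )


def back_translate_degenerate(protein: str) -> str:
    """Concatenate one degenerate codon per residue of a one-letter protein string."""
    return "".join(degenerate_codon(residue) for residue in protein.upper())


if __name__ == "__main__":
    print(degenerate_codon("L"))             # YTN
    print(degenerate_codon("S"))             # WSN
    print(back_translate_degenerate("MLS"))  # ATGYTNWSN
```

Residues with a single codon (Met, Trp) come out unchanged, while Leu and Ser reproduce the YTN and WSN examples given in the FAQ.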