Understanding Peptide Sequences and Amino Acid Nomenclature
Education | 8 min read | 2026-01-28

An educational primer on how peptide sequences are written, the amino acid code systems, common modifications, and how to interpret the structural shorthand used in research peptide product descriptions and scientific literature.

Research Use Only: All products and compounds discussed on this page are intended for laboratory research purposes only. They are not intended for human consumption, veterinary use, or any form of therapeutic application. Information presented is derived from published scientific literature and does not constitute medical advice.

For researchers entering the field of peptide science, the nomenclature and conventions used to describe peptide structures can initially be overwhelming. This primer covers the essential knowledge needed to read, understand, and communicate about peptide sequences, from the basic amino acid codes to common modifications and structural notation.

The 20 Standard Amino Acids

Nearly all naturally occurring proteins and most synthetic peptides are built from the same 20 standard (proteinogenic) amino acids. Each amino acid has a common name, a three-letter abbreviation, and a one-letter code:

Nonpolar (Hydrophobic) Amino Acids

Name | Three-Letter | One-Letter | Key Feature
Glycine | Gly | G | Smallest amino acid; side chain is a single hydrogen
Alanine | Ala | A | Methyl side chain; simple hydrophobic
Valine | Val | V | Branched chain; beta-branched
Leucine | Leu | L | Branched chain; commonly in helices
Isoleucine | Ile | I | Branched chain; beta-branched
Proline | Pro | P | Cyclic; introduces rigidity in chain
Phenylalanine | Phe | F | Aromatic ring; hydrophobic
Tryptophan | Trp | W | Largest amino acid; indole ring
Methionine | Met | M | Thioether; oxidation-sensitive

Polar Uncharged Amino Acids

Name | Three-Letter | One-Letter | Key Feature
Serine | Ser | S | Hydroxyl group; phosphorylation site
Threonine | Thr | T | Hydroxyl group; beta-branched
Cysteine | Cys | C | Thiol group; forms disulfide bonds
Tyrosine | Tyr | Y | Phenol ring; phosphorylation site
Asparagine | Asn | N | Amide; deamidation-prone
Glutamine | Gln | Q | Amide; deamidation-prone

Charged Amino Acids

Name | Three-Letter | One-Letter | Charge at pH 7
Aspartic acid | Asp | D | Negative (-1)
Glutamic acid | Glu | E | Negative (-1)
Lysine | Lys | K | Positive (+1)
Arginine | Arg | R | Positive (+1)
Histidine | His | H | ~Neutral (pKa 6.0; partially positive)

Memorization Aid

The one-letter codes are not always intuitive. Some mnemonics:

  • Letters that match (the code is the first letter of the name): Gly, Ala, Val, Leu, Ile, Pro, Ser, Thr, Cys, His, Met
  • Phonetic connections: F (Phenylalanine sounds like F), W (tryptophan = double-ring, W = double-V)
  • Remaining assignments: D (aspartic acid), E (glutamic acid), K (lysine), R (arginine), N (asparagine), Q (glutamine), Y (tyrosine)

Reading Peptide Sequences

Convention: N-Terminus to C-Terminus

Peptide sequences are always written from left (N-terminus, the amino/NH2 end) to right (C-terminus, the carboxyl/COOH end). This convention mirrors the direction of ribosomal protein synthesis (N to C). Solid-phase peptide synthesis normally builds the chain in the opposite direction (C to N), but the finished sequence is still written and read N to C.

Example: Gly-His-Lys (GHK)

  • Gly is at the N-terminus (left)
  • Lys is at the C-terminus (right)
  • The peptide has two peptide bonds: Gly-His and His-Lys

Three-Letter vs. One-Letter Notation

Both notation systems are widely used:

Three-letter notation: More explicit and less prone to misreading. Used in product descriptions, COAs, and detailed structural discussions.

  • Example: Gly-Glu-Pro-Pro-Pro-Gly-Lys-Pro-Ala-Asp-Asp-Ala-Gly-Leu-Val (BPC-157)

One-letter notation: More compact. Used in databases, bioinformatics, and when space is limited.

  • Example: GEPPPGKPADDAGLV (BPC-157)

Dashes and Notation

  • Dashes between residues (Gly-His-Lys) indicate peptide bonds in three-letter notation
  • No dashes in one-letter notation (GHK)
  • Dashes in one-letter notation sometimes indicate chain breaks or modifications
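
The two notations map onto each other one to one for the 20 standard residues, so converting between them is a simple table lookup. The following is a minimal Python sketch; the function names `to_one_letter` and `to_three_letter` are illustrative choices, and the code assumes unmodified sequences containing only standard residues joined by single dashes.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Gly": "G", "Ala": "A", "Val": "V", "Leu": "L", "Ile": "I",
    "Pro": "P", "Phe": "F", "Trp": "W", "Met": "M", "Ser": "S",
    "Thr": "T", "Cys": "C", "Tyr": "Y", "Asn": "N", "Gln": "Q",
    "Asp": "D", "Glu": "E", "Lys": "K", "Arg": "R", "His": "H",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(three_letter_seq: str) -> str:
    """Convert e.g. 'Gly-His-Lys' to 'GHK'.

    Raises KeyError for anything that is not a standard residue,
    including modification tags such as 'Ac' or 'NH2'.
    """
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split("-"))


def to_three_letter(one_letter_seq: str) -> str:
    """Convert e.g. 'GHK' to 'Gly-His-Lys' (uppercase codes only)."""
    return "-".join(ONE_TO_THREE[aa] for aa in one_letter_seq)


if __name__ == "__main__":
    print(to_one_letter("Gly-His-Lys"))
    print(to_three_letter("GEPPPGKPADDAGLV"))
```

Failing loudly on unknown tokens is a deliberate choice here: silently skipping a residue would produce a plausible-looking but wrong sequence.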

Common Modifications

Research peptides frequently incorporate chemical modifications that alter their properties. Understanding the notation for these modifications is essential.

Terminal Modifications

Acetylation (Ac- or N-Ac):

  • An acetyl group (CH3-CO-) is added to the N-terminus
  • Protects against aminopeptidase degradation
  • Notation: Ac-Gly-His-Lys or Ac-GHK
  • Effect: Removes the positive charge at the N-terminus, may improve stability

Amidation (-NH2):

  • The C-terminal carboxyl group is converted to an amide (-CONH2)
  • Protects against carboxypeptidase degradation
  • Notation: Gly-His-Lys-NH2 or GHK-NH2
  • Effect: Removes the negative charge at the C-terminus, often improves biological activity

Both modifications together: Ac-Gly-His-Lys-NH2 (an acetylated and amidated tripeptide with no terminal charges)
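
The effect of terminal capping on charge can be illustrated with a rough count. The sketch below uses the integer side-chain charges from the table above and treats histidine as neutral at pH 7; it is a simplified approximation, not a pKa-based calculation, and the function name `approx_net_charge` and its parameters are hypothetical.

```python
# Integer side-chain charges at pH 7 (simplification: His counted as neutral).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def approx_net_charge(seq: str, n_acetyl: bool = False, c_amide: bool = False) -> int:
    """Rough net charge of a linear peptide at pH 7.

    seq      -- one-letter sequence of standard residues, e.g. 'GHK'
    n_acetyl -- True if the N-terminus is acetylated (Ac-)
    c_amide  -- True if the C-terminus is amidated (-NH2)
    """
    charge = sum(SIDE_CHAIN_CHARGE.get(aa, 0) for aa in seq)
    if not n_acetyl:
        charge += 1  # free amino terminus carries +1
    if not c_amide:
        charge -= 1  # free carboxyl terminus carries -1
    return charge


if __name__ == "__main__":
    for label, kwargs in [
        ("GHK", {}),
        ("Ac-GHK", {"n_acetyl": True}),
        ("GHK-NH2", {"c_amide": True}),
        ("Ac-GHK-NH2", {"n_acetyl": True, "c_amide": True}),
    ]:
        print(label, approx_net_charge("GHK", **kwargs))
```

For Ac-GHK-NH2 the only charge left in this approximation comes from the lysine side chain, which matches the description of a tripeptide with no terminal charges.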

Non-Natural Amino Acids

Many research peptides incorporate amino acids not found in standard proteins:

D-amino acids:

  • Mirror images of the standard L-amino acids
  • Notation: D-Arg, D-Trp, D-Phe, or using lowercase letters (r, w, f)
  • Effect: Resistant to most proteases, which are stereospecific for L-amino acids
  • Example: Ipamorelin contains D-2-Nal (D-2-naphthylalanine) and D-Phe
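
The lowercase convention can be expanded mechanically into the explicit D- prefix form. The sketch below is one possible way to do that in Python, assuming a one-letter sequence made up only of the 20 standard residues; non-standard residues such as D-2-Nal have no one-letter code and are outside its scope. The function name `expand_with_chirality` is illustrative.

```python
ONE_TO_THREE = {
    "G": "Gly", "A": "Ala", "V": "Val", "L": "Leu", "I": "Ile",
    "P": "Pro", "F": "Phe", "W": "Trp", "M": "Met", "S": "Ser",
    "T": "Thr", "C": "Cys", "Y": "Tyr", "N": "Asn", "Q": "Gln",
    "D": "Asp", "E": "Glu", "K": "Lys", "R": "Arg", "H": "His",
}


def expand_with_chirality(seq: str) -> str:
    """Expand a one-letter sequence, reading lowercase letters as D-residues.

    Example: 'RGDfK' becomes 'Arg-Gly-Asp-D-Phe-Lys'.
    """
    parts = []
    for aa in seq:
        name = ONE_TO_THREE[aa.upper()]
        # Lowercase marks a D-residue; glycine is achiral, so it gets no prefix.
        if aa.islower() and name != "Gly":
            name = "D-" + name
        parts.append(name)
    return "-".join(parts)


if __name__ == "__main__":
    print(expand_with_chirality("RGDfK"))
```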

Aib (alpha-aminoisobutyric acid):

  • A non-natural amino acid with two methyl groups on the alpha-carbon
  • Promotes helical structure
  • Found in Ipamorelin: Aib-His-D-2-Nal-D-Phe-Lys-NH2

Dmt (2',6'-dimethyltyrosine):

  • A modified tyrosine with methyl groups on the aromatic ring
  • Found in SS-31: D-Arg-Dmt-Lys-Phe-NH2

Orn (ornithine):

  • Similar to lysine but with one fewer carbon in the side chain
  • Sometimes used in peptide design for altered spacing

Disulfide Bonds

Cysteine residues can form covalent disulfide bonds that create loops or connect separate peptide chains:

  • Notation: Parenthetical numbering showing which cysteines are linked
  • Example: A peptide with Cys at positions 3 and 8 forming a disulfide: (Cys3-Cys8)
  • Brackets may also be used: [Cys3-Cys8]

Cyclization

Cyclic peptides contain a ring, most commonly formed by connecting the N-terminus to the C-terminus:

  • Notation: cyclo- prefix, or brackets: cyclo(Arg-Gly-Asp-D-Phe-Lys)
  • Head-to-tail cyclization creates a ring structure with no free termini
  • Side-chain-to-side-chain cyclization (e.g., lactam bridges) may also occur

PEGylation

Attachment of polyethylene glycol (PEG) chains:

  • Notation: PEG-peptide or peptide-PEG, with PEG molecular weight specified
  • Example: PEG-40K-GHK (GHK with a 40 kDa PEG chain)
  • Effect: Dramatically extends half-life by increasing molecular size and reducing renal clearance

Drug Affinity Complex (DAC)

A specialized modification used in CJC-1295:

  • A maleimidopropionic acid-lysine linker that covalently binds to serum albumin
  • Notation: CJC-1295 DAC or CJC-1295 with Drug Affinity Complex
  • Effect: Extends half-life from minutes to days

Molecular Weight Calculation

For researchers who need to calculate molar concentrations, the molecular weight of a peptide can be estimated:

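MW = Sum of free amino acid molecular weights - (n - 1) x 18.02

Where:

  • Free amino acid molecular weights are those of the intact amino acids (e.g., glycine = 75.07)
  • n = number of amino acids, so (n - 1) is the number of peptide bonds, each formed with the loss of one water
  • 18.02 = molecular weight of water
  • Equivalently, using residue weights (each amino acid minus one water): MW = Sum of residue weights + 18.02
  • Additional adjustments apply for terminal modifications, counter ions, etc.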

Online peptide molecular weight calculators are readily available and handle modifications automatically.
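
As a worked illustration of the estimate, the Python sketch below sums average residue masses (each amino acid minus one water) and adds back a single water for the free termini, which is equivalent to summing free amino acid weights and subtracting one water per peptide bond. The residue masses are standard average values; the function name `peptide_mw`, its parameters, and the rounded mass shifts for acetylation and amidation are assumptions made for this example, and counter ions are not included.

```python
WATER = 18.015  # average molecular weight of water, g/mol

# Average residue masses (amino acid minus one water), g/mol.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}

ACETYL_SHIFT = 42.037  # N-terminal acetylation: H replaced by CH3-CO
AMIDE_SHIFT = -0.985   # C-terminal amidation: OH replaced by NH2


def peptide_mw(seq: str, n_acetyl: bool = False, c_amide: bool = False) -> float:
    """Estimate the average MW of a linear peptide (free base, no counter ions)."""
    mw = sum(RESIDUE_MASS[aa] for aa in seq) + WATER
    if n_acetyl:
        mw += ACETYL_SHIFT
    if c_amide:
        mw += AMIDE_SHIFT
    return mw


if __name__ == "__main__":
    print(f"GHK:     {peptide_mw('GHK'):.2f}")
    print(f"BPC-157: {peptide_mw('GEPPPGKPADDAGLV'):.2f}")
```

The result is the weight of the peptide itself; the gross weight of a supplied salt form will be higher.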

Counter Ions and Salt Forms

Synthetic peptides are typically supplied as salts. The counter ion affects the gross molecular weight and peptide content:

Trifluoroacetate (TFA) salt:

  • Most common salt form from HPLC purification (TFA is used in the mobile phase)
  • TFA molecular weight: 114.02 Da per TFA molecule
  • Each basic site (Arg, Lys, and His side chains, plus a free N-terminus) can carry one TFA counter ion
  • Can be exchanged to acetate form if TFA interferes with research applications

Acetate salt:

  • Commonly used alternative to TFA
  • Acetic acid molecular weight: 60.05 Da per molecule (59.04 Da as the acetate anion)
  • More biocompatible for cell culture and in-vivo applications
  • Produced by salt exchange from TFA form

Hydrochloride (HCl) salt:

  • Used less commonly
  • HCl molecular weight: 36.46 Da per HCl molecule
  • Simple, well-characterized counter ion

Impact on peptide content: A highly basic peptide (multiple Arg/Lys residues) in TFA salt form may have a peptide content of only 60-65%, meaning that 35-40% of the powder mass is TFA counter ions plus residual moisture. This is not an impurity — it is simply the salt form. Researchers must account for this when calculating molar concentrations.
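
The correction for peptide content is simple arithmetic, sketched below in Python. The function names are hypothetical, the peptide content is assumed to come from the vendor's documentation as a fraction between 0 and 1, and the theoretical estimate considers TFA counter ions only (one per basic site) while ignoring residual moisture, so real values are typically lower.

```python
TFA_MW = 114.02  # g/mol, trifluoroacetic acid


def theoretical_content_tfa(peptide_mw: float, n_basic_sites: int) -> float:
    """Upper-bound peptide fraction of a TFA salt, ignoring moisture."""
    return peptide_mw / (peptide_mw + n_basic_sites * TFA_MW)


def powder_mass_for_net_peptide(net_peptide_mg: float, peptide_content: float) -> float:
    """Mass of powder (mg) to weigh out to obtain a given net peptide mass."""
    if not 0 < peptide_content <= 1:
        raise ValueError("peptide_content must be a fraction in (0, 1]")
    return net_peptide_mg / peptide_content


def concentration_mM(powder_mg: float, peptide_content: float,
                     peptide_mw: float, volume_ml: float) -> float:
    """Molar concentration (mM) after dissolving powder in a given volume."""
    net_mg = powder_mg * peptide_content  # mass that is actually peptide
    mmol = net_mg / peptide_mw            # mg / (g/mol) = mmol
    return mmol / volume_ml * 1000        # mmol/mL = mol/L, then to mM


if __name__ == "__main__":
    # Example inputs: MW 1419.55, two basic sites (one Lys plus the N-terminus).
    print(f"{theoretical_content_tfa(1419.55, 2):.3f}")
    print(f"{powder_mass_for_net_peptide(1.0, 0.65):.3f} mg powder per mg peptide")
    print(f"{concentration_mM(5.0, 0.65, 1419.55, 1.0):.3f} mM")
```

Using the gross powder mass without this correction would overstate the concentration by the inverse of the peptide content.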

Common Peptide Naming Conventions

Research peptides may be referred to by various names:

  • Systematic name: Based on the amino acid sequence (e.g., Gly-His-Lys)
  • Trade/common name: An informal name used in research communities (e.g., GHK-Cu, BPC-157, Ipamorelin)
  • Code name: An alphanumeric designation from the developing laboratory (e.g., AOD-9604, CJC-1295, SS-31)
  • CAS number: A unique numerical identifier assigned by Chemical Abstracts Service (e.g., 137525-51-0 for BPC-157)

When ordering research peptides, the CAS number is usually the least ambiguous single identifier, since product names and sequence descriptions can vary between vendors.

Conclusion

Peptide nomenclature follows logical conventions that become intuitive with practice. Understanding how sequences are written, what modifications mean, and how salt forms affect the physical product allows researchers to communicate precisely about their materials and interpret vendor product descriptions accurately. When in doubt, the CAS number and the full amino acid sequence (including all modifications) provide the most unambiguous identification of a research peptide.

This article is for educational purposes related to peptide chemistry and research. All peptides discussed are for laboratory research use only and are not intended for human consumption.
