Understanding Peptide Sequence Notation

Last updated 2026-06-21

How peptide sequences are written and interpreted in research: one-letter and three-letter codes, the N-to-C convention, and why notation precision matters for records.

What sequence notation is

A peptide is defined by the order in which its amino acids are connected, and sequence notation is the systematic way of writing that order down. Because the sequence determines what the molecule is, notation is the foundation of how research materials are named and identified. The overview below describes the conventions in use and why they matter; it does not provide guidance on any application or use of a material.

Notation systems are standardised so that a sequence written by one researcher is interpreted identically by another, regardless of the field or institution. Without that consistency, the same peptide could be described in multiple incompatible ways, making records and orders unreliable. Standard notation eliminates that ambiguity.

One-letter and three-letter codes

Three-letter codes

Each of the twenty standard amino acids has an internationally agreed three-letter abbreviation derived from its name. Glycine is Gly, alanine is Ala, leucine is Leu, serine is Ser, and so on. A sequence in three-letter code is usually written with hyphens between residues, for example Gly-Ala-Leu-Ser. This form is readable and unambiguous, which suits formal documentation, specifications, and scientific publications where clarity matters more than brevity. The three-letter format is widely used in characterisation documents and research references, and most readers can confirm each residue without consulting a code table.

One-letter codes

For longer sequences, three-letter notation becomes impractical, so the one-letter code assigns a single uppercase letter to each standard amino acid: G for glycine, A for alanine, L for leucine, S for serine, and so on. The letter is not always the initial of the name: lysine is K and tryptophan is W, for example, because several amino acids share an initial. One-letter notation lets a complete sequence be written compactly as an unbroken string of letters, which is useful in databases, sequence files, and wherever space is limited. Both formats convey the same information; they are simply different representations of the same defined sequence.
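As an illustration of how the two codes map onto each other, the following Python sketch converts between them. The function names three_to_one and one_to_three are hypothetical, and the sketch assumes hyphen-separated three-letter input containing only the twenty standard residues; anything else raises a KeyError rather than being guessed at.

```python
# Three-letter to one-letter codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
# Inverse mapping, built from the table above.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def three_to_one(sequence: str) -> str:
    """Convert 'Gly-Ala-Leu-Ser' style notation to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in sequence.split("-"))


def one_to_three(sequence: str) -> str:
    """Convert a one-letter string to hyphen-separated three-letter notation."""
    return "-".join(ONE_TO_THREE[letter] for letter in sequence)


print(three_to_one("Gly-Ala-Leu-Ser"))  # GALS
print(one_to_three("GALS"))             # Gly-Ala-Leu-Ser
```

Because the mapping is one-to-one for the standard residues, converting in one direction and back returns the original string.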

The N-to-C direction convention

A peptide chain has two chemically distinct ends. In an unmodified peptide, the N-terminus carries a free amino group and the C-terminus carries a free carboxyl group. By convention, sequences are written from left to right, starting at the N-terminus and ending at the C-terminus. This direction is not arbitrary: the two termini are chemically different, so a sequence read in the reverse direction describes a different molecule unless it happens to read the same both ways. Gly-Ala and Ala-Gly, for example, are different dipeptides. Consistently following the convention means the same material is described the same way across every record, reference, and order.
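The effect of reading direction can be sketched in a few lines of Python. The helper names below are hypothetical, and the sketch assumes hyphen-separated three-letter notation with the N-terminus on the left.

```python
def reverse_reading(sequence: str) -> str:
    """Return the sequence with its residue order reversed."""
    return "-".join(reversed(sequence.split("-")))


def reads_same_both_ways(sequence: str) -> bool:
    """True only when reversing the residue order gives the same string."""
    return reverse_reading(sequence) == sequence


forward = "Gly-Ala-Leu-Ser"            # N-terminus on the left
backward = reverse_reading(forward)

print(backward)                              # Ser-Leu-Ala-Gly
print(forward == backward)                   # False: a different molecule
print(reads_same_both_ways("Gly-Ala-Gly"))   # True: the exceptional case
```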

When ordering or referencing a research peptide, confirming that the sequence is written in the standard N-to-C direction helps prevent misidentification. For background on the chemistry of the peptide bond and why direction matters at the molecular level, see What Is a Peptide? For how identity fields appear on product listings, see the catalogue.

Modifications and non-standard residues

Many research peptides carry chemical modifications in addition to their sequence of standard amino acids. Common examples include N-terminal acetylation, C-terminal amidation, and side-chain modifications such as phosphorylation or the attachment of a fluorescent label. The basic one-letter and three-letter codes do not capture these modifications on their own, so extended notation conventions and explicit descriptors, such as an Ac- prefix for N-terminal acetylation or an -NH2 suffix for C-terminal amidation, are used alongside the sequence string to give a complete, unambiguous identity.
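One way to see why modification descriptors belong to the identity is to split a written name into its parts. The Python sketch below is illustrative only: it assumes a hyphen-separated three-letter name with an optional Ac prefix and an optional NH2 suffix, which is one common convention rather than the only one, and the names PeptideIdentity and parse_name are hypothetical. Side-chain modifications and non-standard residues are deliberately out of scope and raise an error.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

STANDARD_RESIDUES = frozenset({
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
})
# Assumed descriptors; real documents may use other spellings.
N_TERMINAL_DESCRIPTORS = {"Ac"}   # N-terminal acetylation
C_TERMINAL_DESCRIPTORS = {"NH2"}  # C-terminal amidation


@dataclass(frozen=True)
class PeptideIdentity:
    n_term: Optional[str]
    residues: Tuple[str, ...]
    c_term: Optional[str]


def parse_name(name: str) -> PeptideIdentity:
    """Split a name such as 'Ac-Gly-Ala-Leu-NH2' into descriptors and residues."""
    parts = name.split("-")
    n_term = parts.pop(0) if parts and parts[0] in N_TERMINAL_DESCRIPTORS else None
    c_term = parts.pop() if parts and parts[-1] in C_TERMINAL_DESCRIPTORS else None
    unknown = [p for p in parts if p not in STANDARD_RESIDUES]
    if unknown:
        # Do not guess: anything unrecognised needs an explicit description.
        raise ValueError(f"unrecognised residue or descriptor: {unknown}")
    return PeptideIdentity(n_term, tuple(parts), c_term)


modified = parse_name("Ac-Gly-Ala-Leu-NH2")
plain = parse_name("Gly-Ala-Leu")

print(modified.n_term)                      # Ac
print(modified.residues)                    # ('Gly', 'Ala', 'Leu')
print(modified.c_term)                      # NH2
print(modified.residues == plain.residues)  # True: same residue order
print(modified == plain)                    # False: different identity
```

The last two lines show the point made above: the residue order alone does not distinguish the two materials, but the full identity does.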

Non-standard amino acids, sometimes called non-canonical amino acids, also appear in research materials. Because no single universally agreed code exists for most of them, they are described explicitly in the material name and specification. Where a material contains a non-standard residue, the specification sets this out clearly so the identity remains unambiguous and comparable across records.

How notation supports identification

Precise sequence notation is what allows a named peptide to be distinguished from any other. Two peptides made from the same amino acids but joined in a different order are different molecules, and their sequences written in standard notation are visibly different strings. This makes notation the primary means of ensuring that documentation, orders, and labels all refer to the same defined material. For how amino acids are classified and what their properties contribute to a sequence, see Amino Acid Classifications in Peptide Research.
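The distinction between composition and sequence can be sketched as follows. This is a minimal Python illustration using one-letter strings; the function name same_composition is hypothetical.

```python
from collections import Counter


def same_composition(first: str, second: str) -> bool:
    """True when both one-letter sequences contain the same residues in the same counts."""
    return Counter(first) == Counter(second)


first = "GALS"   # Gly-Ala-Leu-Ser
second = "GLAS"  # Gly-Leu-Ala-Ser

print(same_composition(first, second))  # True: the same amino acids
print(first == second)                  # False: a different order, so a different peptide
```

A composition match is therefore not evidence of identity; only the ordered string is.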

Notation in specifications and records

On a research material specification, the sequence or name of a peptide is an identity field. It should match the material name as recorded in laboratory documentation, order references, and certificates of analysis. Keeping notation consistent across all of these supports traceability over time and reduces the risk of confusing related but distinct materials. For how identity fields sit alongside analytical data in a specification, see Understanding Research Material Specifications.

Where materials include modifications, the full description of those modifications forms part of the identity. Recording the complete name or sequence string, including any modification descriptors, at the point of receipt ensures the laboratory record precisely matches what was supplied. This is particularly relevant where several related peptides with similar sequences but different modifications are used within the same research programme, since the differences may not be obvious from names alone.
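A consistency check across records might look like the following Python sketch. It is an assumption-laden illustration rather than a description of any particular system: the record names (order, certificate, lab_record) and the function names are hypothetical, and the only normalisation applied is removing whitespace. Case is left untouched on the assumption that it may be meaningful in the notation.

```python
from typing import Dict, List


def normalise(identity: str) -> str:
    """Remove all whitespace; leave case and descriptors exactly as written."""
    return "".join(identity.split())


def find_mismatches(records: Dict[str, str]) -> List[str]:
    """Return the names of records whose identity differs from the first record's."""
    items = list(records.items())
    if not items:
        return []
    reference = normalise(items[0][1])
    return [name for name, identity in items[1:] if normalise(identity) != reference]


records = {
    "order": "Ac-Gly-Ala-Leu-NH2",
    "certificate": "Ac-Gly-Ala-Leu-NH2 ",   # trailing space only
    "lab_record": "Gly-Ala-Leu-NH2",        # acetyl descriptor missing
}

print(find_mismatches(records))  # ['lab_record']
```

Here the lab record has dropped the N-terminal descriptor, which is exactly the kind of difference that is easy to miss when related peptides are handled side by side.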

Why notation conventions matter for research

Research depends on reproducibility, and reproducibility depends on clear identity. Notation conventions exist so that any researcher reading a sequence in standard form arrives at the same molecule. When notation is inconsistent or ambiguous, a material can be confused with a related but distinct compound, which undermines the value of both the results and the records behind them. Adopting and maintaining standard notation throughout ordering, receipt, and use is a straightforward step with lasting value for the integrity of laboratory documentation. Our general approach to material identification is described on the Quality page, and the scope of supply for all materials is set out in our Research use statement.