Protein Recommendations

Substitution Variant


a sequence change where, compared to a reference sequence, one amino acid is replaced by one other amino acid.


Format: “prefix”“amino_acid”“position”“new_amino_acid”, e.g. p.(Arg54Ser)

“prefix” = reference sequence used = p.
“amino_acid” = reference amino acid = Arg
“position” = position of the substituted amino acid = 54
“new_amino_acid” = new amino acid = Ser
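The four parts of the format can be pulled apart mechanically. The following is a minimal sketch, not an official validator: the function name `parse_substitution`, the regular expression, and the returned dictionary keys are assumptions made for illustration. It accepts only the 20 standard three-letter amino acid codes as the reference, and those codes plus Ter or * as the new amino acid.

```python
import re

# Three-letter codes for the 20 standard amino acids.
AMINO_ACIDS = (
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile "
    "Leu Lys Met Phe Pro Ser Thr Trp Tyr Val"
).split()
_AA = "|".join(AMINO_ACIDS)

# Illustrative pattern (an assumption, not part of the recommendations).
# Optional parentheses mark a predicted consequence; if the opening
# parenthesis is present, the closing one is required.
SUBSTITUTION = re.compile(
    rf"^p\.(?P<open>\()?(?P<ref>{_AA})(?P<pos>[1-9]\d*)"
    rf"(?P<alt>{_AA}|Ter|\*)(?(open)\))$"
)


def parse_substitution(description):
    """Split a description such as 'p.(Arg54Ser)' into its parts."""
    match = SUBSTITUTION.match(description)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {description!r}")
    return {
        "prefix": "p.",
        "amino_acid": match["ref"],
        "position": int(match["pos"]),
        "new_amino_acid": match["alt"],
        "predicted": match["open"] is not None,
    }
```

For example, `parse_substitution("p.(Arg54Ser)")` yields the reference amino acid Arg, position 54, new amino acid Ser, and flags the description as predicted.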


  • the only accepted reference sequence prefix is “p.” (protein).
  • predicted consequences, i.e. without experimental evidence (no RNA or protein sequence analysed), should be given in parentheses, e.g. p.(Arg727Ser).
  • changes involving two or more consecutive amino acids are described as deletion/insertions (indels) (see Deletion/insertion (delins)).
    NOTE: when either of two directly flanking substitution variants is known as a regularly occurring variant, the variants are described individually and not as a “delins”.
  • a nonsense variant, a variant changing an amino acid to a translation termination (stop) codon, is described as a substitution.
    NOTE: a nonsense variant is not described as a Deletion of the C-terminal end of the protein (e.g. p.Trp26_Arg1623del)
  • a no-stop variant, a variant changing the translation termination codon into an amino acid codon, is described as an extension (see Extension).
  • amino acids that have been tested and found not changed are described as p.Cys123= (see SVD-WG001 (no change)).
    NOTE: the underlying DNA change must be given in addition and in this case is either c.456C>T or c.456C=
    NOTE: such changes are silent protein changes
  • the description p.Arg76_Cys77delinsSerTrp is preferred over p.[Arg76Ser;Cys77Trp].
    NOTE: by definition this change can not be described as a substitution (like p.Arg76_Cys77SerTrp)
  • it is not correct to describe “polymorphisms” as p.76Ser/Arg (see Discussions).


  • missense
    • LRG_199p1:p.Trp24Cys
      amino acid Trp24 is changed to a Cys
    • NP_003997.1:p.(Trp24Cys)
      amino acid Trp24 is predicted to change to a Cys (no experimental proof, e.g. based on DNA level data)
  • nonsense
    • LRG_199p1:p.Trp24Ter (p.Trp24*)
      amino acid Trp24 is changed to a stop codon (Ter, *)
      NOTE: this change is not described as a deletion of the C-terminal end of the protein (i.e. p.Trp24_Met36853del)
  • silent (no change)
    • NP_003997.1:p.Cys188=
      amino acid Cys188 is not changed (DNA level change ..TGC.. to ..TGT..)
      NOTE: the description p.= means the entire protein coding region was analysed and no variant was found that changes (or is predicted to change) the protein sequence.
  • translation initiation codon
    description depends on the consequences of the change on the translation product (protein);
    • LRG_199p1:p.0
      as a consequence of a variant no protein is produced
    • LRG_199p1:p.? (p.Met1?)
      the consequences of a variant at the protein level are not known (cannot be predicted)
  • new translation initiation site
    • upstream - see Extension
    • downstream - NP_003997.1:p.Leu2_Met124del
      a variant causes the inactivation of the normal and activation of a downstream translation initiation site (Met) resulting in deletion of the first 123 amino acids (Met-1 to Val-123) of the protein.
      NOTE: the 3’ rule applies.
  • translation termination codon (stop codon, no-stop change)
    see Extension
  • uncertain
    • NP_003997.1:p.(Gly56Ala^Ser^Cys)
      amino acid Gly56 is changed to an Ala, Ser or Cys (see Uncertain)
  • mosaic
    • LRG_199p1:p.Trp24=/Cys
      a mosaic case where, at amino acid position 24, protein containing a Cys (Trp24Cys) is found besides protein containing the normal amino acid (a Trp, described as “=”)
      NOTE: irrespective of the frequency in which each amino acid was found, the reference is always described first
      NOTE: for the predicted consequences of a variant the description is LRG_199p1:p.(Trp24=/Cys)
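Several of the examples above (missense, predicted, uncertain, mosaic) differ only in how the change at one position is written. The sketch below assembles such descriptions from their parts. The function name `format_substitution` and its parameters are hypothetical, and the function does no validation of amino acid codes or reference sequence identifiers.

```python
def format_substitution(reference, ref_aa, position, new_aas,
                        predicted=False, mosaic=False):
    """Build a description for a change at a single amino acid position.

    new_aas   -- one three-letter code, or a list of alternatives that
                 are joined with '^' (uncertain outcome)
    predicted -- wrap the change in parentheses (no experimental evidence)
    mosaic    -- list the unchanged reference ('=') first, then '/' and
                 the new amino acid
    """
    if isinstance(new_aas, str):
        new_aas = [new_aas]
    alternatives = "^".join(new_aas)
    change = f"=/{alternatives}" if mosaic else alternatives
    body = f"{ref_aa}{position}{change}"
    if predicted:
        body = f"({body})"
    return f"{reference}:p.{body}"


print(format_substitution("LRG_199p1", "Trp", 24, "Cys"))
print(format_substitution("NP_003997.1", "Trp", 24, "Cys", predicted=True))
print(format_substitution("NP_003997.1", "Gly", 56, ["Ala", "Ser", "Cys"],
                          predicted=True))
print(format_substitution("LRG_199p1", "Trp", 24, "Cys", mosaic=True))
```

The four calls reproduce the missense, predicted missense, uncertain, and mosaic descriptions listed in the examples.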


Are polymorphisms described like p.2366Gln/Lys?

No, all substitutions are described as NP_003997.1:p.Gln2366Lys. In the past, the format p.2366Gln/Lys (p.2366Q/K) has been used to describe "polymorphic" sequence variants. Note that a description should be neutral, simply describe the change, and not include any other information like predicted or known functional consequences.

Can I describe a TrpVal to CysArg variant as an amino acid substitution (p.TrpVal24CysArg)?

No, this is not allowed. By definition a substitution changes one amino acid into one other amino acid. The change TrpVal to CysArg should be described as NP_003997.1:p.Trp24_Val25delinsCysArg, i.e. a deletion/insertion (indel) (see Deletion-Insertion).
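The rule can be sketched in code: a change at one position is written as a substitution, while changes at two or more consecutive positions are merged into a single deletion/insertion. The function name `describe_consecutive_change` and its tuple-based input are assumptions for illustration. The sketch does not cover the exception where one of two flanking substitutions is a known, regularly occurring variant and the variants are therefore described individually.

```python
def describe_consecutive_change(changes, predicted=False):
    """Describe amino acid changes at consecutive positions.

    changes -- list of (reference_aa, position, new_aa) tuples
    One change gives a substitution; two or more consecutive changes
    give a deletion/insertion (delins).
    """
    if not changes:
        raise ValueError("at least one change is required")
    ordered = sorted(changes, key=lambda change: change[1])
    positions = [position for _, position, _ in ordered]
    expected = list(range(positions[0], positions[0] + len(positions)))
    if positions != expected:
        raise ValueError("positions must be consecutive and unique")

    first_ref, first_pos, first_alt = ordered[0]
    if len(ordered) == 1:
        body = f"{first_ref}{first_pos}{first_alt}"
    else:
        last_ref, last_pos, _ = ordered[-1]
        inserted = "".join(alt for _, _, alt in ordered)
        body = f"{first_ref}{first_pos}_{last_ref}{last_pos}delins{inserted}"
    return f"p.({body})" if predicted else f"p.{body}"


print(describe_consecutive_change([("Trp", 24, "Cys"), ("Val", 25, "Arg")]))
# p.Trp24_Val25delinsCysArg
```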

How should you describe an amino acid substitution to any other amino acid?

HGVS uses IUPAC symbols (see Standards). The symbol for 'any' amino acid is 'X' (one-letter code) or 'Xaa' (three-letter code). Since 'X' has also been used to indicate a translation stop codon (nonsense variant), we suggest using only the three-letter code 'Xaa' (e.g. p.Arg782Xaa).
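The ambiguity of 'X' matters when one-letter descriptions are converted to the three-letter form. The sketch below is one possible way to handle it, assuming a bare shorthand input such as 'Q2366K' (a hypothetical input format chosen for this example). It maps '*' to Ter and refuses 'X' rather than guessing whether 'any amino acid' or a stop codon was meant. The function name `to_three_letter` is also an assumption.

```python
import re

# IUPAC one-letter to three-letter codes for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(short):
    """Convert a one-letter shorthand such as 'Q2366K' to 'p.Gln2366Lys'."""
    match = re.fullmatch(r"([A-Z])([1-9]\d*)([A-Z*])", short)
    if match is None:
        raise ValueError(f"unrecognised shorthand: {short!r}")
    ref, position, alt = match.groups()
    if "X" in (ref, alt):
        # 'X' may mean 'any amino acid' or, in older usage, a stop codon.
        raise ValueError("'X' is ambiguous; write Xaa or Ter explicitly")
    try:
        ref3 = ONE_TO_THREE[ref]
        alt3 = "Ter" if alt == "*" else ONE_TO_THREE[alt]
    except KeyError as error:
        raise ValueError(f"unknown amino acid code: {error.args[0]!r}") from None
    return f"p.{ref3}{position}{alt3}"


print(to_three_letter("Q2366K"))  # p.Gln2366Lys
print(to_three_letter("W24*"))    # p.Trp24Ter
```

Rejecting 'X' outright is a deliberate choice in this sketch: the caller has to state whether Xaa or Ter is intended, which matches the suggestion to use the three-letter code only.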