Protein Recommendations

Extension Variant


Definitions

Extension
a sequence change in the translation initiation (start) or translation termination (stop) codon extending the normal translational reading frame at the N- or C-terminal end with one or more amino acids.

Description

Format (N-terminal): “prefix”“Met1”“ext”“position_new_initiation_site”, e.g. p.Met1ext-5

“prefix” = reference sequence used = p.
“Met1” = normal translation initiation site = Met1
“ext” = type of change is an extension = ext
“position_new_initiation_site” = position new upstream translation initiation site = -5

Format (C-terminal): “prefix”“Ter_position”“new_amino_acid”“ext”“position_new_termination_site”, e.g. p.Ter110Glnext*17

“prefix” = reference sequence used = p.
“Ter_position” = normal translation termination site = Ter110
“new_amino_acid” = amino acid encoded by variant termination codon = Gln
“ext” = type of change is an extension = ext
“position_new_termination_site” = position new downstream translation termination site = *17
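
The two formats above can be checked mechanically. The following is a minimal sketch, not an official validator: the function name `parse_extension`, the returned dictionary keys and the regular expressions are all illustrative assumptions. It only covers the short forms shown here (three-letter amino acid codes, "Ter" or "*" for the normal stop codon) and not descriptions that spell out the full extension sequence.

```python
import re

# Three-letter amino acid code such as "Gln" (assumption: no one-letter codes).
AA = r"[A-Z][a-z]{2}"

# N-terminal, e.g. p.Met1ext-5 or p.Met1Valext-12
N_TERM = re.compile(rf"p\.Met1(?P<new_aa>{AA})?ext(?P<pos>-\d+)")

# C-terminal, e.g. p.Ter110Glnext*17, p.*110Glnext*17 or p.Ter327Argext*?
C_TERM = re.compile(
    rf"p\.(?:Ter|\*)(?P<stop>\d+)(?P<new_aa>{AA})ext(?P<pos>\*(?:\d+|\?))"
)


def parse_extension(description):
    """Split an extension description into its parts, or return None."""
    # Parentheses mark a predicted consequence, e.g. p.(Ter110Glnext*17).
    predicted = description.startswith("p.(") and description.endswith(")")
    core = "p." + description[3:-1] if predicted else description
    for kind, pattern in (("N-terminal", N_TERM), ("C-terminal", C_TERM)):
        match = pattern.fullmatch(core)
        if match:
            return {"kind": kind, "predicted": predicted, **match.groupdict()}
    return None
```

For example, `parse_extension("p.(Ter110Glnext*17)")` reports a predicted C-terminal extension with the normal stop at 110, the new amino acid Gln and the new termination position *17, while a description that uses "Ter17" for the new termination position is not recognised and yields `None`.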


Note

  • prefix reference sequence accepted is “p.” (protein).
  • predicted consequences, i.e. without experimental evidence (no RNA or protein analysed), should be given in parentheses, e.g. p.(Ter110Glnext*17).
  • extension variants were accepted on 2012-08-31
  • variants that affect the translation initiation site (Met1) and activate a downstream (C-terminal) translation initiation site are described as a deletion, e.g. p.Gly2_Met46del.
  • only variants that directly affect the normal translation termination codon are described as protein extensions (C-terminal). Variants in upstream (N-terminal) sequences that extend the protein beyond the normal translation termination codon are described as frame shifts.

Examples

  • p.Met1ext-5
    a variant in the 5’ UTR activates a new upstream translation initiation site starting with amino acid Met-5
    NOTE: modified from p.Met1extMet-5
  • p.Met1Valext-12
    amino acid Met1 is changed to Val activating an upstream translation initiation site at position -12 (Met-12)
    NOTE: modified from p.Met1ValextMet-12
  • p.Ter110Glnext*17 (alternatively p.*110Glnext*17)
    a variant in the stop codon (Ter/*) at position 110, changing it to a Gln-codon (a no-stop variant) and adding a tail of new amino acids to the protein’s C-terminus, ending at a new stop codon at position *17
    NOTE: p.Ter110GlnextTer17 is not correct, “Ter17” does not indicate a “position” but “amino acid 17”, only “*” is correct
  • p.(Ter315TyrextAsnLysGlyThrTer) (alternatively p.*315TyrextAsnLysGlyThr*)
    a variant in the stop codon (Ter/*) at position 315, changing it to a Tyr-codon (a no-stop variant) and adding a tail of new amino acids to the protein’s C-terminus, ending at a new stop codon (Ter5/*5)
  • p.Ter327Argext*? (alternatively p.*327Argext*?)
    a variant in the stop codon (Ter/*) at position 327, changing it to an Arg-codon and adding a tail of new amino acids of unknown length (position *?) since the shifted frame does not contain a new stop codon.
    NOTE: added on 2012-11-01
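
The C-terminal examples above can be derived from the variant DNA sequence by translating from the altered stop codon until a new stop codon is found. The following is a minimal sketch under stated assumptions: the function name `c_terminal_extension` and its arguments are hypothetical, the standard genetic code is used, and the new stop is numbered so that the first amino acid after the original stop position is *1 (consistent with the Ter315 example, where four new residues after Tyr are followed by a stop at *5). The result is given in parentheses because it is predicted from DNA only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def c_terminal_extension(stop_position, variant_tail):
    """Describe a predicted C-terminal extension.

    stop_position: protein position of the normal stop codon (e.g. 110).
    variant_tail:  variant coding DNA (A, C, G, T only), starting at the first
                   base of the altered stop codon and running into the 3' UTR.
    """
    tail = variant_tail.upper()
    # Split into complete codons; a trailing partial codon is ignored.
    codons = [tail[i:i + 3] for i in range(0, len(tail) - 2, 3)]
    residues = [CODON_TABLE[codon] for codon in codons]
    if not residues or residues[0] == "*":
        raise ValueError("the stop codon is not changed into an amino acid codon")
    new_aa = THREE_LETTER[residues[0]]
    # Index 0 is the altered stop codon; the residue after it is position *1.
    for index, residue in enumerate(residues[1:], start=1):
        if residue == "*":
            return f"p.(Ter{stop_position}{new_aa}ext*{index})"
    # No in-frame stop codon in the sequence supplied.
    return f"p.(Ter{stop_position}{new_aa}ext*?)"
```

With the tail `TATAATAAAGGTACTTAA` (Tyr, Asn, Lys, Gly, Thr, stop) and stop position 315, the sketch returns `p.(Ter315Tyrext*5)`. The `ext*?` result is only meaningful when the supplied tail runs to the end of the transcript.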

Q&A

What are protein-level variants that directly affect the translation initiation (start) codon called?

The variant is called a start-lost variant. It is an N-terminal extension, one of the two types of protein extension. Note the difference from a start-gained variant, another type of N-terminal extension, in which the start codon itself is not directly affected.

What are protein-level variants that directly affect the translation termination (stop) codon called?

The variant is called a no-stop or stop-lost variant. It is a C-terminal extension, one of the two types of protein extension.

How do I describe an extension when no new stop codon is reached?

Such variants are described using the format p.Ter789Argext*?, i.e. "ext*?" to indicate that no new termination codon is encountered.

How should a variant in the 5'UTR be described that gives rise to a new translation initiation site?

At the DNA level the description is, for example, c.-23A>T (changing c.-25 caGggt c.-19 to caTggt, creating a new ATG-triplet). At the RNA level it is r.-23a>u, and at the protein level p.(Met1ext-8), indicating that the predicted protein has an N-terminal extension of 8 amino acids.
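
The step from the position of the new ATG to the protein description is simple arithmetic: in the example the new ATG starts at c.-24, which is 24 nucleotides or 8 codons upstream of the normal start. The following is a minimal sketch; the function name `n_terminal_extension` is hypothetical, and it assumes the new ATG is the one actually used and that no in-frame stop codon lies between it and the normal start codon.

```python
def n_terminal_extension(atg_position):
    """Predict the protein description for a new upstream ATG.

    atg_position: c. coordinate of the A of the new ATG in the 5' UTR,
                  given as a negative integer (e.g. -24 for c.-24).
    Returns None when the ATG is out of frame with the normal start codon,
    since it then does not extend the normal reading frame.
    """
    if atg_position >= 0:
        raise ValueError("expected a 5' UTR position such as -24")
    # Codon -k spans c.-3k to c.-3k+2, so an in-frame ATG starts at a
    # multiple of three.
    if atg_position % 3 != 0:
        return None
    # Parentheses: the consequence is predicted, not observed.
    return f"p.(Met1ext{atg_position // 3})"
```

Calling `n_terminal_extension(-24)` gives `p.(Met1ext-8)`, matching the description above.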

Should I describe a duplication in the translation termination codon (TGA to TGGA) as a frame shift or as an extension?

The variant is in the translation termination codon and therefore by definition an extension.