Protein Recommendations

Alleles Variant


Definitions

Allele
a series of variants in a protein encoded by one chromosome.

Description

Format (one allele): “prefix”[“variant1”;“variant2”], e.g. p.[Ser73Arg;Asn103del]

  • “prefix” = reference sequence used = p.
  • [ = opening symbol for allele = [
  • “variant1” = description first variant = Ser73Arg
  • ; = separator symbol two changes = ;
  • “variant2” = description second variant = Asn103del
  • ] = closing symbol for allele = ]

Format (two alleles): “prefix”[“variant”];[“variant”], e.g. p.[Ser73Arg];[Asn103del]

  • “prefix” = reference sequence used = p.
  • [ = opening symbol for allele-1 = [
  • “variant” = description variant = Ser73Arg
  • ];[ = closing symbol for allele-1, separator symbol two alleles, opening symbol for allele-2 = ];[
  • “variant” = description variant = Asn103del
  • ] = closing symbol for allele-2 = ]
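
The two formats above can be read mechanically: each pair of square brackets is one allele, and ";" separates variants inside an allele as well as the alleles themselves. A minimal Python sketch of that reading follows. The function name split_alleles is hypothetical and not part of the recommendations; the sketch only checks the bracket structure, not whether each variant description is valid, and it leaves "," separated descriptions as a single item.

```python
import re


def split_alleles(description):
    """Split 'p.[a;b];[c]' into [['a', 'b'], ['c']] (hypothetical helper)."""
    if not description.startswith("p."):
        raise ValueError("expected the 'p.' prefix")
    body = description[2:]
    # One or more bracketed alleles, joined by the allele separator ';'.
    if not re.fullmatch(r"\[[^\[\]]+\](;\[[^\[\]]+\])*", body):
        raise ValueError("expected '[...]' groups separated by ';'")
    # The content of each bracket pair is one allele.
    alleles = re.findall(r"\[([^\[\]]+)\]", body)
    # Inside an allele, ';' separates the individual variants.
    return [allele.split(";") for allele in alleles]


print(split_alleles("p.[Ser73Arg;Asn103del]"))
print(split_alleles("p.[Ser73Arg];[Asn103del]"))
```

The first call yields one allele carrying two variants, the second yields two alleles carrying one variant each.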

Note

  • the only accepted reference sequence prefix is “p.” (protein).
  • predicted consequences, i.e. without experimental evidence (no RNA or protein analysed), should be given in parentheses, e.g. p.[(Arg727Ser)];[(Arg727Ser)]
  • when two variants are identified in a protein that derive from one chromosome (in cis) this should be described as “p.[variant1;variant2]”.
  • when two variants are identified in proteins that derive from different chromosomes (in trans) this should be described as “p.[variant1];[variant2]”.
  • when two variants are identified in a protein, but it is not known whether these derive from one chromosome (in cis) or from different chromosomes (in trans), this should be described as “p.variant1(;)variant2”, i.e. without using “[ ]”. NOTE: it is recommended to determine whether the changes are in the same protein or not.
  • parentheses enclosing predicted protein variants are listed around each variant and inside the square brackets of the allele; p.[(variant1);(variant2)]
  • when two variants are identified in two different proteins that derive from one variant at the DNA level (giving two different transcripts) the variants are separated using a “,”; “p.[variant1,variant2]”.
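
The cis, trans and unknown-phase rules in these notes can be summarised as a small string builder. This is an illustrative Python sketch only: the function name describe_pair, the phase labels "cis", "trans" and "unknown", and the predicted flags are assumptions made for the example and are not defined by the recommendations.

```python
def describe_pair(variant1, variant2, phase, predicted=(False, False)):
    """Combine two variant descriptions (given without 'p.') into one string.

    phase is 'cis', 'trans' or 'unknown' (hypothetical labels).
    predicted marks which of the two variants lack experimental evidence.
    """
    # Predicted consequences get parentheses around each variant,
    # so they always end up inside any allele brackets.
    parts = [
        f"({variant})" if is_predicted else variant
        for variant, is_predicted in zip((variant1, variant2), predicted)
    ]
    if phase == "cis":      # same chromosome: one pair of brackets
        return f"p.[{parts[0]};{parts[1]}]"
    if phase == "trans":    # different chromosomes: one bracket pair each
        return f"p.[{parts[0]}];[{parts[1]}]"
    if phase == "unknown":  # phase not determined: no brackets, '(;)'
        return f"p.{parts[0]}(;){parts[1]}"
    raise ValueError(f"unknown phase: {phase!r}")


print(describe_pair("Ser73Arg", "Asn603del", "cis"))
print(describe_pair("Ser73Arg", "Asn603del", "trans", predicted=(True, True)))
print(describe_pair("Ser73Arg", "Asn603del", "unknown", predicted=(True, True)))
```

Adding the parentheses per variant before the brackets are applied is what keeps mixed cases such as p.[Ser73Arg;(Asn603del)] expressible.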

Examples

For more examples see DNA alleles.

  • p.[Ser73Arg;Asn603del]
    a protein allele contains two different variants, p.Ser73Arg and p.Asn603del. The variants are found in cis.
  • p.[(Ser73Arg);(Asn603del)]
    a protein allele contains two different predicted variants, p.Ser73Arg and p.Asn603del. The predicted variants are found in cis.
    NOTE: the parentheses are placed inside of the allele brackets
    NOTE: the description p.([Ser73Arg;Asn603del]) is not correct
  • p.[Ser73Arg;(Asn603del)]
    a protein allele contains two different variants, p.Ser73Arg and the predicted variant p.(Asn603del). The variants are found in cis.
  • p.[Ser73Arg];[Asn603del]
    the two protein alleles each contain a different variant, p.Ser73Arg and p.Asn603del. A heterozygous case (compound heterozygote, e.g. in a recessive disease). The variants are found in trans.
  • p.[(Ser73Arg)];[(Asn603del)]
    the two protein alleles each contain a different predicted variant, p.(Ser73Arg) and p.(Asn603del). A heterozygous case (compound heterozygote, e.g. in a recessive disease). The variants are found in trans.
    NOTE: the parentheses are placed inside of the allele brackets
    NOTE: the description p.([Ser73Arg];[Asn603del]) is not correct
  • p.[Ser73Arg];[Ser73Arg]
    both protein alleles contain the same variant, p.Ser73Arg. A homozygous case (e.g. in a recessive disease).
  • p.(Ser73Arg)(;)(Asn603del)
    two predicted protein variants are found, p.(Ser73Arg) and p.(Asn603del), but it is not known whether they derive from the same or from different alleles (chromosomes).
    NOTE: when it is not known on which allele a variant is, allele brackets should not be used
  • p.[Ser73Arg];[Ser73=]
    one protein allele contains a variant, p.Ser73Arg, the other protein allele contains the reference sequence, Ser73= (is wild-type).
    NOTE: for other variant types the format is p.[Ser73del];[Ser73=], p.[Ser73_Arg79dup];[Ser73_Arg79=], p.[Ser73_Ala74insSerGln];[Ser73_Ala74=], etc.
    NOTE: using p.= would mean the entire protein reference sequence was tested and found not changed
  • p.[Ser73Arg];[(?)]
    one protein allele contains a variant, p.Ser73Arg, while a variant for the other protein allele is expected but not yet identified (p.(?)) (e.g. in individuals affected by a recessive disease).
  • p.[Asn26His,Ala25_Gly29del]
    two different proteins, p.Asn26His and p.Ala25_Gly29del, derived from a variant on one chromosome (c.76A>C at the DNA level and r.[76a>c,73_88del] at the RNA level)
  • p.[Arg83=/Arg83Ser]
    a somatic case where a protein allele in some cells has a normal sequence (Arg83=), while other cells have a Ser at this position (Arg83Ser)
  • p.[Arg83=//Arg83Ser]
    a chimeric organism where a protein allele in one cell type has a normal sequence (Arg83=), while the other cell type has a Ser at this position (Arg83Ser).
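
As a reading aid for the two-allele examples above, the following Python sketch assigns a plain-language label to a description of the form p.[...];[...]. The function name classify_two_alleles and the returned labels are hypothetical and chosen only for this illustration; the sketch compares the allele contents as text and does not validate the variants themselves.

```python
import re


def classify_two_alleles(description):
    """Label a 'p.[...];[...]' description (labels are illustrative only)."""
    match = re.fullmatch(r"p\.\[([^\[\]]+)\];\[([^\[\]]+)\]", description)
    if match is None:
        # Covers single-allele, mosaic, chimeric and unknown-phase forms.
        return "not a two-allele description"
    first, second = match.groups()
    # An allele ending in '=' is described as unchanged (reference).
    unchanged = [allele.endswith("=") for allele in (first, second)]
    if all(unchanged):
        return "no variant on either allele"
    if "(?)" in (first, second):
        return "second variant expected but not identified"
    if any(unchanged):
        return "heterozygous, other allele is reference"
    if first == second:
        return "homozygous"
    return "heterozygous, different variants in trans"


for example in ("p.[Ser73Arg];[Asn603del]", "p.[Ser73Arg];[Ser73Arg]",
                "p.[Ser73Arg];[Ser73=]", "p.[Ser73Arg];[(?)]"):
    print(example, "->", classify_two_alleles(example))
```
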

Q&A

Was the original recommendation to use the format [p.Ser73Arg+p.Asn103del]?

Indeed. Originally (den Dunnen and Antonarakis, 2000) the suggestion was to describe two changes in a protein encoded by one chromosome as [p.Ser73Arg+p.Asn103del], i.e. using a "+" character to separate the two changes, while an earlier publication (Antonarakis and the Nomenclature Working Group, 1998) suggested using a ";" ([p.Ser73Arg;p.Asn103del]). To prevent confusion with older publications, to improve overall consistency and to keep descriptions as short as possible, the 2000 proposal was retracted. The recommended format is p.[Ser73Arg;Asn103del].

Can I describe the predicted protein consequences of two variants on the same allele as p.([Phe233Leu;Cys690Trp])?

No, this should be described as p.[(Phe233Leu);(Cys690Trp)], i.e. with parentheses inside the square brackets of the allele and around each variant. This format is used for overall consistency; with the parentheses inside the square brackets, variants can be described as p.[Phe233Leu;(Cys690Trp)], which would not be possible if parentheses were only allowed outside the square brackets.

In recessive diseases, is it important I show which variants were found in which combination?

When you find more than one variant in one individual, it is essential to indicate clearly which variants were found and in the protein from which allele(s):
  • disease severity will depend on the combination of variants found,
  • in recessive disease, when two variants are in the protein from one allele, the individual is either a carrier or has a variant in the protein from the second allele that has not been found.

I find the notation p.[Ser73Arg] without describing the second protein allele misleading; not enough researchers know this refers to only one of the two alleles present. Would using p.[Ser73Arg];[] be OK?

No, the recommended description is p.[Ser73Arg];[Ser73=], i.e. p.Ser73= for "no change" on the second protein allele.

How should I describe the variants detected in males and females for a protein encoded by the X-chromosome?

In females the description is straightforward, like p.[Ser73Arg];[Ser73=]. In males there is no second allele (X-chromosome), which can be described as p.[Ser73Arg];[0], i.e. using "p.0" to indicate the absence of a protein from the second X-chromosome.
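
The X-chromosome answer can be sketched in Python as follows. The function name x_linked_description and the sex labels are hypothetical, and the female branch assumes a heterozygous female whose second allele is unchanged at the same position. The regular expression only recognises a leading three-letter amino acid position or range (e.g. Ser73 or Ser73_Arg79), which is a simplification.

```python
import re


def x_linked_description(variant, sex):
    """Describe one X-linked variant, given without 'p.' (hypothetical helper)."""
    if sex == "male":
        # No second X-chromosome: the second allele is described as 0.
        return f"p.[{variant}];[0]"
    if sex == "female":
        # Assumption: the second allele carries the reference sequence,
        # written as the affected position or range followed by '='.
        position = re.match(r"[A-Z][a-z]{2}\d+(?:_[A-Z][a-z]{2}\d+)?", variant)
        if position is None:
            raise ValueError(f"cannot find a position in {variant!r}")
        return f"p.[{variant}];[{position.group(0)}=]"
    raise ValueError(f"unknown sex: {sex!r}")


print(x_linked_description("Ser73Arg", "female"))
print(x_linked_description("Ser73Arg", "male"))
```
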