RNA Recommendations

Substitution Variant


a sequence change where, compared to a reference sequence, one nucleotide is replaced by one other nucleotide.


Format: “prefix”“position_substituted”“reference_nucleotide”“>”“new_nucleotide”, e.g. r.123a>g

“prefix” = reference sequence used = r.
“position_substituted” = position of the substituted nucleotide = 123
“reference_nucleotide” = nucleotide at reference position = a
“>” = type of change is a substitution = >
“new_nucleotide” = substituted nucleotide = g
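
The format above can be checked mechanically. The following is a minimal sketch, not an official validator: the pattern and the function name parse_rna_substitution are illustrative assumptions, and it only covers simple positions (optionally prefixed with "-" or "*", as used in the examples on this page).

```python
import re

# Illustrative pattern for "prefix" "position" "reference_nucleotide" ">" "new_nucleotide".
# Assumption: positions are plain integers, optionally prefixed with "-" or "*".
RNA_SUBSTITUTION = re.compile(
    r"(?P<prefix>r\.)"
    r"(?P<position_substituted>[-*]?\d+)"
    r"(?P<reference_nucleotide>[acgu])"
    r">"
    r"(?P<new_nucleotide>[acgu])"
)


def parse_rna_substitution(description):
    """Return the components of a simple RNA substitution, or None if it does not match."""
    match = RNA_SUBSTITUTION.fullmatch(description)
    return match.groupdict() if match else None


print(parse_rna_substitution("r.123a>g"))
print(parse_rna_substitution("r.76_77aa>ug"))  # not a substitution by definition
```

Because the pattern accepts exactly one reference nucleotide and one new nucleotide, descriptions such as r.76_77aa>ug are rejected rather than silently accepted.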


  • all variants should be described at the DNA level; descriptions at the RNA and/or protein level may be given in addition
  • the only reference sequence prefix accepted is r. (coding and non-coding RNA).
  • substitutions involving two or more consecutive nucleotides are described as deletion/insertions (indels) (see Deletion/insertion (delins)).
  • two substitutions separated by one or more nucleotides should be described individually and not as a “delins”
    • exception: two variants separated by one nucleotide, together affecting one amino acid, should be described as a “delins” (e.g. r.142_144delinsugg (p.Arg48Trp)).
      NOTE: this prevents tools that predict the consequences of a variant from making conflicting and incorrect predictions of two different substitutions at one position
  • nucleotides that have been tested and found not changed are described as r.109u=, r.4567_4569= (see SVD-WG001 (no change)).
  • it is not correct to describe “polymorphisms” as r.76a/g (see Discussions).
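
The rules for two nearby substitutions can be sketched as a small decision function. This is an illustration under stated assumptions, not a reference implementation: the function name combine_substitutions is hypothetical, the reference is passed as a plain string whose first character is position 1, and codons are assumed to start at position 1 (so positions 142 to 144 form codon 48, as in the example above).

```python
def combine_substitutions(ref, first, second):
    """Describe two substitutions on one allele.

    ref    -- reference RNA sequence; ref[0] is position 1 (assumption)
    first  -- (position, new_nucleotide)
    second -- (position, new_nucleotide)
    """
    (p1, n1), (p2, n2) = sorted([first, second])
    gap = p2 - p1 - 1  # number of unchanged nucleotides between the two
    same_codon = (p1 - 1) // 3 == (p2 - 1) // 3  # assumes codon 1 starts at position 1

    if gap == 0:
        # consecutive nucleotides: describe as a deletion/insertion
        return f"r.{p1}_{p2}delins{n1}{n2}"
    if gap == 1 and same_codon:
        # exception: one nucleotide apart and affecting the same amino acid
        middle = ref[p1]  # unchanged nucleotide at position p1 + 1
        return f"r.{p1}_{p2}delins{n1}{middle}{n2}"
    # otherwise: describe each substitution individually
    return f"r.[{p1}{ref[p1 - 1]}>{n1};{p2}{ref[p2 - 1]}>{n2}]"


# Hypothetical reference: filler "a" with the codon "cga" at positions 142-144.
reference = "a" * 141 + "cga" + "aaa"
print(combine_substitutions(reference, (142, "u"), (144, "g")))
print(combine_substitutions("a" * 80, (76, "u"), (77, "g")))
```

The sketch only handles positive positions inside the coding region; positions written with "-" or "*" would need separate handling.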


  • LRG_199t1:r.76a>c
    a substitution of the “a” nucleotide at r.76 with a “c”
  • LRG_199t1:r.123=
    a screen was performed showing that nucleotide r.123 was a “c” as in the coding DNA reference sequence (the nucleotide was not changed). Alternative description: NM_004006.1:r.123c=.
  • NM_004006.1:r.-14a>c
    an “a” to “c” substitution 14 nucleotides 5’ of the ATG translation initiation codon
  • LRG_199t1:r.*41u>a
    a “u” to “a” substitution 41 nucleotides 3’ of the translation termination codon
  • LRG_199t1:r.[897u>g,832_960del]
    two different transcripts, r.897u>g and r.832_960del, derive from one variant (LRG_199t1:c.897T>G at the DNA level)
  • the description r.76_77delinsug is preferred over r.[76a>u;77a>g]
    NOTE: based on the definition of a substitution, i.e. one nucleotide replaced by one other nucleotide, this change cannot be described as a substitution like r.76_77aa>ug or r.76aa>ug
  • NM_004006.1:r.0
    no RNA from the variant allele could be detected
  • LRG_199t1:r.spl
    RNA has not been analysed but it is very likely that splicing is affected
  • LRG_199t1:r.?
    an effect on the RNA level is expected but it is not possible to give a reliable prediction of the consequences (RNA not analysed)
  • LRG_199t1:r.85=/u>c
    a mosaic case where, at position 85, transcripts containing a C (r.85u>c) are found in addition to the normal sequence (a U, described as “=”)
    NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first
  • LRG_199t1:r.85=//u>c
    a chimeric case, i.e. the sample is a mix of cells containing r.85= and r.85u>c.
    NOTE: irrespective of the frequency in which each nucleotide was found, the reference is always described first
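
The examples above follow a few recognisable shapes, which can be told apart with simple string checks. The sketch below is only a rough illustration: the function name classify_rna_description and the returned labels are assumptions made here, and the checks cover just the forms listed on this page.

```python
def classify_rna_description(description):
    """Return a coarse label for an RNA-level description (illustrative only)."""
    body = description.split(":", 1)[-1]  # drop the reference sequence, if present

    if body == "r.0":
        return "no RNA detected"
    if body == "r.spl":
        return "splicing likely affected (RNA not analysed)"
    if body == "r.?":
        return "effect expected, consequence unknown (RNA not analysed)"
    if "=//" in body:  # check before "=/" since "=//" contains it
        return "chimeric"
    if "=/" in body:
        return "mosaic"
    if body.startswith("r.[") and "," in body:
        return "several transcripts from one variant"
    if "delins" in body:
        return "deletion/insertion"
    if body.endswith("="):
        return "no change"
    if ">" in body:
        return "substitution"
    return "unrecognised"


for example in ("LRG_199t1:r.76a>c", "LRG_199t1:r.85=/u>c", "LRG_199t1:r.85=//u>c"):
    print(example, "->", classify_rna_description(example))
```

The order of the checks matters: the chimeric form is tested before the mosaic form, and the bracketed multi-transcript form before the plain substitution.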


When I only sequenced RNA (cDNA) and not genomic DNA should I then give the description of a variant at DNA level in parenthesis?

Yes. While the variant can be described at the RNA level as r.76a>g, at the DNA level, based on a coding DNA reference sequence, it should be described as c.(76A>G).
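
As an illustration of this answer, an RNA substitution can be rewritten as the corresponding DNA-level description in parentheses. This is a minimal sketch: the function name rna_to_inferred_dna is hypothetical, it handles simple substitutions only, and it assumes the RNA and coding DNA descriptions use the same position numbering.

```python
import re

# Simple RNA substitution: optional "-" or "*" before the position (assumption).
SIMPLE_RNA_SUBSTITUTION = re.compile(r"r\.([-*]?\d+)([acgu])>([acgu])")


def rna_to_inferred_dna(rna_description):
    """Convert e.g. 'r.76a>g' to 'c.(76A>G)'; raise ValueError for anything else."""
    match = SIMPLE_RNA_SUBSTITUTION.fullmatch(rna_description)
    if match is None:
        raise ValueError(f"not a simple RNA substitution: {rna_description!r}")
    position, ref, new = match.groups()

    def to_dna(nucleotide):
        # RNA uses lowercase a, c, g, u; DNA uses uppercase A, C, G, T
        return nucleotide.replace("u", "t").upper()

    return f"c.({position}{to_dna(ref)}>{to_dna(new)})"


print(rna_to_inferred_dna("r.76a>g"))
```

The parentheses signal that the DNA-level change was inferred from the RNA data rather than observed in genomic DNA.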

Are polymorphisms described like r.76a/g?

No, all substitutions are described as r.76a>g. In the past, the format r.76a/g has been used to describe "polymorphic" sequence variants. Note that a description should be neutral, simply describe the change, and not include any other information like predicted or known functional consequences.

Can I describe a GC to TG variant as a dinucleotide substitution (r.4gc>ug)?

No, this is not allowed. By definition a substitution changes one nucleotide into one other nucleotide. The change "ugugcca" to "uguugca" should be described as a deletion/insertion (indel) as r.4_5delinsug.
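
The distinction in this answer can be sketched in code by comparing a reference and an observed sequence of equal length. The function name describe_change is an assumption made for illustration, the first character of each string is taken as position 1 (as in the "ugugcca" example), and the sketch deliberately refuses cases where the changed nucleotides are not consecutive, since those should be described individually.

```python
def describe_change(ref, observed):
    """Describe an equal-length change as a substitution or a delins (illustrative)."""
    if len(ref) != len(observed):
        raise ValueError("this sketch only handles sequences of equal length")
    if ref == observed:
        return None  # nothing to describe

    # trim the unchanged nucleotides on both sides
    start = 0
    while ref[start] == observed[start]:
        start += 1
    end = len(ref)
    while ref[end - 1] == observed[end - 1]:
        end -= 1

    # every nucleotide in the remaining span must differ (consecutive changes only)
    if any(ref[i] == observed[i] for i in range(start, end)):
        raise ValueError("changes are not consecutive; describe them individually")

    if end - start == 1:
        # exactly one nucleotide replaced by one other nucleotide: a substitution
        return f"r.{start + 1}{ref[start]}>{observed[start]}"
    # two or more consecutive nucleotides: a deletion/insertion
    return f"r.{start + 1}_{end}delins{observed[start:end]}"


print(describe_change("ugugcca", "uguugca"))
```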