Motif regular expression

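A motif regular expression is a notation commonly used to represent motifs of amino acids or nucleotides. It follows these conventions, each of which appears in the examples below: [XY] means "either X or Y"; {X} means "any symbol except X"; and X? means that X is optional, so it may be present or absent.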

Thus, the amino acid pattern "AR[ND]C?E" encompasses the four protein strings "ARNCE", "ARDCE", "ARNE", and "ARDE"; the DNA pattern "CC{T}AG" encompasses the three DNA strings "CCAAG", "CCCAG", and "CCGAG".
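
The two examples above can be checked mechanically. The following is a minimal Python sketch, assuming only the three notations that appear in those examples (square brackets, braces, and a trailing "?"). The function names expand_motif and motif_to_regex, and the alphabet constants, are hypothetical helpers introduced for illustration; they are not part of any standard library.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter protein alphabet
NUCLEOTIDES = "ACGT"                  # assumed DNA alphabet


def expand_motif(motif: str, alphabet: str) -> list[str]:
    """List every string described by a motif (illustrative helper)."""
    results = [""]
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "[":
            # [XY]: any one of the listed symbols
            j = motif.index("]", i)
            choices = list(motif[i + 1:j])
            i = j + 1
        elif ch == "{":
            # {X}: any symbol of the alphabet except the listed ones
            j = motif.index("}", i)
            excluded = set(motif[i + 1:j])
            choices = [c for c in alphabet if c not in excluded]
            i = j + 1
        else:
            # a plain symbol matches itself
            choices = [ch]
            i += 1
        if i < len(motif) and motif[i] == "?":
            # a trailing "?" makes the preceding element optional
            choices = choices + [""]
            i += 1
        results = [prefix + c for prefix in results for c in choices]
    return results


def motif_to_regex(motif: str) -> str:
    """Rewrite {X} as the negated class [^X]; [XY] and ? already work in `re`.

    Caveat: [^X] matches any character other than X, not only alphabet symbols.
    """
    return re.sub(r"\{([^}]+)\}", r"[^\1]", motif)


protein_strings = expand_motif("AR[ND]C?E", AMINO_ACIDS)
dna_strings = expand_motif("CC{T}AG", NUCLEOTIDES)
print(sorted(protein_strings))
print(sorted(dna_strings))
print(bool(re.fullmatch(motif_to_regex("CC{T}AG"), "CCGAG")))
```

Enumerating strings is only practical for short motifs, since each bracket, brace, or optional element multiplies the number of results. For searching within a longer sequence, translating the motif to an ordinary regular expression, as motif_to_regex does, is usually the more workable route.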