ABSTRACT
Simple sequence repeats (SSRs) are molecular genetic markers that serve as powerful tools in genomics studies, and SSR markers are routinely mined as part of genetic workflows. Here, we developed a novel SSR mining algorithm based on a regular expression that can reduce the complexity of commonly used SSR mining software. We used the following SSR mining regular expression: ({i, j}?) (\1) {k}, where i and j denote the minimum and maximum lengths of the SSR motif, respectively, and k is the minimum number of motif repeats. Based on this algorithm, we developed SSR sequence analysis software (named “regexSSRw”) that mines eligible SSR loci from FASTA-format sequences; regexSSRw can be accessed at https://github.com/renm79/rgxSSRw. This SSR mining algorithm can support a range of applications, from programmers developing SSR mining software to researchers incorporating it into their SSR marker workflows.
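The following Python sketch illustrates how a pattern of this shape might be applied to FASTA-format input. It is not the regexSSRw implementation. Several details are assumptions made for illustration: the motif character class [ACGT] (the pattern as printed above does not show one), the open-ended quantifier {k,} so that a whole repeat run is reported as one locus, and the hypothetical helper names build_ssr_pattern, parse_fasta, and find_ssrs.

```python
import re
from typing import Iterator, List, Tuple


def build_ssr_pattern(i: int, j: int, k: int) -> "re.Pattern[str]":
    """Compile an SSR pattern for motifs of length i..j.

    Assumptions (not confirmed by the source): motifs consist of A, C, G, T,
    and k is the minimum number of copies that follow the first motif,
    written as the open-ended quantifier {k,}.
    """
    # The lazy quantifier {i,j}? makes the engine try the shortest motif first;
    # \1 is a backreference to the captured motif.
    return re.compile(r"([ACGT]{%d,%d}?)(?:\1){%d,}" % (i, j, k))


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-format text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def find_ssrs(seq: str, i: int, j: int, k: int) -> List[Tuple[int, str, int]]:
    """Return (0-based start, motif, total copies) for each SSR locus found."""
    pattern = build_ssr_pattern(i, j, k)
    loci = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        copies = len(m.group(0)) // len(motif)  # first motif plus its repeats
        loci.append((m.start(), motif, copies))
    return loci


if __name__ == "__main__":
    example = ">seq1\nGGCATATATATCC\nAAAAAAG\n"
    for name, sequence in parse_fasta(example):
        for start, motif, copies in find_ssrs(sequence, 1, 6, 3):
            print(name, start, motif, copies)
```

Because the motif quantifier is lazy, a run such as ATATATAT is reported with the motif AT rather than ATAT. Matches returned by finditer do not overlap, so each repeat run is reported once.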
Competing Interest Statement
The authors have declared no competing interest.