ISSN: 2349-3917
Exact string-matching algorithms have become essential components of many bioinformatics tools. Despite the abundance and diversity of such algorithms, subjecting them to real-time experimental analysis has been critical. This study was conducted to evaluate the efficiency of ten exact string-matching algorithms on large-scale genomic sequences from a runtime perspective, with the aim of identifying the algorithms best qualified to handle the small alphabet used for nucleic acid coding. The methodology adopted for this study was a factorial experiment with Randomized Complete Block Design (FRCBD) under the influence of four independent parameters: pattern length (four levels), pattern index (four levels), programming language (two levels), and algorithmic architecture (ten levels). The yield (runtime) of the tested algorithms was measured in nanoseconds. One-way ANOVA and two-way ANOVA tests with a post-hoc Games-Howell test were used separately for statistical analysis. Two widely accepted programming languages, C# and Java, were used to examine the possible effect of programming language on algorithm performance.
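To make the measurement concrete, the following Java sketch times a single exact-match search in nanoseconds for one combination of pattern length and pattern index. It is an illustration only, not the study's benchmark code: the naive matcher stands in for any of the ten algorithms, and the class name, method names, sequence size, seed, and the chosen length and index values are all assumptions introduced here.

```java
import java.util.Random;

public class ExactMatchTiming {

    // Naive exact matching (a stand-in, not one of the study's named algorithms):
    // returns the index of the first occurrence of pattern in text, or -1.
    public static int naiveIndexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == m) {
                return i;
            }
        }
        return -1;
    }

    // Builds a reproducible pseudo-random sequence over the DNA alphabet {A, C, G, T}.
    // The study used real genomic sequences; synthetic data is an assumption here.
    public static String randomDna(int length, long seed) {
        Random rng = new Random(seed);
        char[] bases = {'A', 'C', 'G', 'T'};
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(bases[rng.nextInt(bases.length)]);
        }
        return sb.toString();
    }

    // Measures the wall-clock time of one search in nanoseconds.
    public static long timeSearchNanos(String text, String pattern) {
        long start = System.nanoTime();
        int index = naiveIndexOf(text, pattern);
        long elapsed = System.nanoTime() - start;
        if (index < -1) {
            // Never true; only keeps the search result in use.
            throw new IllegalStateException("unexpected index");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        String genome = randomDna(1_000_000, 42L); // hypothetical sequence size and seed
        int patternIndex = 750_000;                // hypothetical pattern index level
        int patternLength = 64;                    // hypothetical pattern length level

        // Cut the pattern out of the text so that at least one match exists.
        // The first match may lie before patternIndex if the pattern repeats.
        String pattern = genome.substring(patternIndex, patternIndex + patternLength);

        long nanos = timeSearchNanos(genome, pattern);
        System.out.println("Search took " + nanos + " ns");
    }
}
```

A single timed call like this is noisy on the JVM because of just-in-time compilation and garbage collection, so a real experiment would repeat each of the 4 × 4 × 2 × 10 = 320 factor combinations many times, after warm-up runs, before applying ANOVA.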