The performance of the algorithm was then tested with a wide range of beam widths, using 15 different datasets consisting of groups of random genes with motifs from the TRANSFAC database planted in them. The results are shown in Figure 5. The performance was assessed in terms of False Positive (FP) and False Negative (FN) rates, where a False Positive is defined as a motif that was not planted, but that appears in the top q motifs returned by BEAM, where q motifs were planted in the data set. A False Negative, on the other hand, is defined to occur when a planted motif does not appear in the top q motifs returned by BEAM. In terms of FP and FN rates, the algorithm is robust to broad changes in the beam width. Beam widths of 1000 and above show no statistically significant differences in the FP and FN rates relative to a full enumeration. Thus, BEAM produces results equivalent to doing a complete enumeration of all motifs, with the added advantage of reducing the FP rate.
We then tested the overall accuracy of the motifs returned by BEAM, running the algorithm on S. Cerevisiae regulons. We assessed the performance of BEAM using the Phi score (Pevzner and Sze, 2000). This metric has been employed in a number of published comparisons of motif-finding algorithms (Pevzner and Sze, 2000; Sinha and Tompa, 2000; Shinozake et al., 2003). Each of the top three motifs was evaluated in terms of its Phi score and the best score was reported. When the biologically relevant motifs are over-represented, such as in the RAP1 and REB1 regulons, BEAM run with an over-representation objective function returns Phi scores that are considerably higher than other motif finding programs (Table 1).
A further consideration in motif-finding is that some biologically relevant motifs are not statistically over-represented. Working from the hypothesis that the biologically relevant motifs must differ from the surrounding upstream sequences in some way in order for them to be biologically active, we tried two other objective functions on a number of different yeast regulons from the SCPD database for which other motif finding programs had failed to find the biologically relevant motifs. The results are also shown in Table 1. The objective functions were the Kolmogorov-Smirnov (K-S) statistic, which measures differences in the positional bias in the occurrence of a motif between the selected group of genes and the genome as a whole, and statistical under-representation. In the case of the K-S statistic, for two regulons (BAS1, and TBP, whose reported motif is the TATA box), BEAM was able to find motifs with a 0.24 and 0.37 Phi score respectively while three other motif finding programs, AlignACE, YMF and MEME (reported in Sinha and Tompa, 2003), all returned Phi scores of between 0 and 0.03 for these two motifs. By narrowing its focus to the most promising candidate motifs, BEAM is able to identify biologically characterized motifs using a computationally expensive objective function, such as K-S. In addition, we obtained the surprising and somewhat counter-intuitive result that when BEAM is run with an under-representation objective function, it is able to find the biologically relevant motif in regulons where none of the popular motif finding programs were successful. In the case of the HAP2 and PHO2 regulons from the SCPD database, BEAM run with an under-representation objective function returns motifs with Phi scores of 0.38 and 0.18 respectively, a significant improvement over the Phi scores reported by other programs.
We next evaluated a method for combining the non-degenerate motifs identified by BEAM into higher scoring degenerate motifs using another algorithm called PRISM (Pattern Relaxation-based Iterative Search for Motifs). We planted multiple copies of a highly degenerate motif using its non-degenerate instantiations and examined the relationship between the Sig scores of each instantiations and the Sig score of their union (e.g. AAW is the union of AAA and AAT). First, we looked at the relationship between the sum of the Sig scores of two non-degenerate motifs that are one mismatch apart and the Sig score of their union. For 26 pairs of instantiations, we found that there was an extremely tight correlation (R2= 0.9978) between the sum of the Sig scores of planted non-degenerate instantiations and the Sig score of the corresponding singly degenerate instantiations (Figure 6, circles). Similarly, we looked at 12 pairs of non-degenerate motifs that are two mismatches apart and found an extremely tight correlation (R2= 0.9868) between the Sig scores of non-degenerate motifs and the Sig scores of their corresponding doubly degenerate instantiations (Figure 6, squares). Finally, we looked at 12 pairs of singly degenerate motifs that combine to yield a motif of a higher degeneracy, and once again found an extremely tight correlation (R2=0.9948) between the Sig scores of singly degenerate motifs and the Sig scores of their corresponding doubly degenerate instantiations (Figure 6, triangles). The extreme linearity in these experiments suggests a mathematical dependency and validates our approach.