Document Type

Article

Publication Date

10-14-2013

Abstract

In this paper, a natural probabilistic model for motif discovery is used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet, Σ. A motif G = g_1 g_2 ... g_m is a string of m characters. Each background sequence has a probabilistically generated approximate copy of G implanted in it. In an approximate copy b_1 b_2 ... b_m of G, every character b_i is generated probabilistically such that the probability that b_i ≠ g_i is at most α. We develop two new randomized algorithms and one new deterministic algorithm. They advance the state of the art in the following respects: (1) The algorithms are much faster than previous ones, and can even run in sublinear time. (2) They can handle any motif pattern. (3) The only restriction on the alphabet size is a lower bound of four, which gives the algorithms potential applications in practical problems, since gene sequences have an alphabet size of four. (4) All algorithms come with rigorous proofs of their performance. The methods developed in this paper have been used in a software implementation. We observed some encouraging results that show improved performance for motif detection compared with other software.
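The probabilistic model described in the abstract can be illustrated with a small generator. The following Python sketch is not the authors' code: the function name `implant_motif`, the sequence length `n`, the uniform background distribution, the uniformly random implant position, and the choice to overwrite background characters (rather than insert) are all assumptions made for illustration. Each motif character is replaced by a different character with probability exactly `alpha`, which satisfies the "at most α" condition.

```python
import random


def implant_motif(k, n, motif, alpha, alphabet="ACGT", seed=None):
    """Generate k background sequences of length n, each with one implanted
    approximate copy of `motif` (hypothetical helper, illustration only)."""
    rng = random.Random(seed)
    m = len(motif)
    sequences, positions = [], []
    for _ in range(k):
        # Background: each character drawn uniformly from the alphabet (assumption).
        background = [rng.choice(alphabet) for _ in range(n)]
        # Approximate copy: b_i differs from g_i with probability alpha.
        copy = []
        for g in motif:
            if rng.random() < alpha:
                copy.append(rng.choice([c for c in alphabet if c != g]))
            else:
                copy.append(g)
        # Implant position chosen uniformly at random (assumption).
        start = rng.randrange(n - m + 1)
        background[start:start + m] = copy
        sequences.append("".join(background))
        positions.append(start)
    return sequences, positions


if __name__ == "__main__":
    seqs, starts = implant_motif(k=4, n=40, motif="ACGTTGCA", alpha=0.1, seed=0)
    for s, p in zip(seqs, starts):
        print(p, s)
```

Instances generated this way could serve as test inputs for a motif discovery program, since the true implant positions are returned alongside the sequences.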

Recommended Citation

Fu, Bin, Yunhui Fu, and Yuan Xue. 2013. "Sublinear Time Motif Discovery from Multiple Sequences" Algorithms 6, no. 4: 636-677. https://doi.org/10.3390/a6040636

Creative Commons License

This work is licensed under a Creative Commons Attribution 3.0 License.

Publication Title

Algorithms

DOI

10.3390/a6040636

Included in

Computer Sciences Commons

Computer Science Faculty Publications and Presentations

Sublinear Time Motif Discovery from Multiple Sequences
