Mining poly-regions in DNA
2012 (English)
In: International journal on data mining and bioinformatics, ISSN 1748-5673, Vol. 6, no 4, 406-428 p.
Article in journal (Refereed) Published
We study the problem of mining poly-regions in DNA. A poly-region is defined as a bursty DNA area, i.e., an area with an elevated frequency of a DNA pattern. We introduce a general formulation that covers a range of meaningful types of poly-regions and develop three efficient detection methods. The first applies recursive segmentation and is entropy-based. The second uses a set of sliding windows that summarize each sequence segment using several statistics. The third employs a technique based on majority vote. The proposed algorithms are tested on DNA sequences of four different organisms and evaluated in terms of recall and runtime.
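To make the sliding-window idea concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a single statistic per window (the fraction of start positions at which the pattern occurs) and a fixed threshold, whereas the abstract mentions several statistics and does not specify how they are combined. The function names `pattern_frequency_windows` and `find_poly_regions`, and the parameters `window`, `threshold`, and `step`, are hypothetical and chosen only for illustration.

```python
def pattern_frequency_windows(sequence, pattern, window, step=1):
    """Return (start, frequency) pairs for each sliding window.

    Frequency is the number of pattern occurrences (overlaps allowed) that
    fit entirely inside the window, divided by the number of possible start
    positions in that window. This single statistic is an assumption made
    for illustration.
    """
    k = len(pattern)
    if k == 0 or window < k:
        raise ValueError("pattern must be non-empty and no longer than the window")
    # hits[i] is 1 when the pattern starts at position i of the sequence.
    hits = [1 if sequence.startswith(pattern, i) else 0
            for i in range(len(sequence) - k + 1)]
    positions = window - k + 1  # start positions available inside one window
    results = []
    for start in range(0, len(sequence) - window + 1, step):
        count = sum(hits[start:start + positions])
        results.append((start, count / positions))
    return results


def find_poly_regions(sequence, pattern, window, threshold, step=1):
    """Merge windows whose frequency meets the threshold into regions.

    Regions are returned as half-open (start, end) intervals. Windows that
    overlap or touch an already reported region are merged into it.
    """
    regions = []
    for start, freq in pattern_frequency_windows(sequence, pattern, window, step):
        if freq < threshold:
            continue
        end = start + window
        if regions and start <= regions[-1][1]:
            # Extend the previous region instead of opening a new one.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Toy sequence with a run of "A" in the middle (hypothetical data).
    dna = "CGTCGTCGTCGT" + "A" * 12 + "CGTCGTCGTCGT"
    print(find_poly_regions(dna, "A", window=6, threshold=0.8))
```

In this sketch the window length and threshold jointly control how sharply region boundaries are drawn: a lower threshold lets windows that only partly overlap the burst qualify, which widens the reported region.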
Place, publisher, year, edition, pages
InderScience Publishers, 2012. Vol. 6, no 4, 406-428 p.
Research subject
Computer and Systems Sciences
Identifiers
URN: urn:nbn:se:su:diva-100729
DOI: 10.1504/IJDMB.2012.049278
OAI: oai:DiVA.org:su-100729
DiVA: diva2:695737