Scientific journals : Article
Life sciences : Multidisciplinary, general & others
http://hdl.handle.net/10993/27104
A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length
English
Favorov, A.A.
Gelfand, M.S.
Gerasimova, A.V.
Ravcheev, Dmitry [University of Luxembourg > Luxembourg Centre for Systems Biomedicine (LCSB)]
Mironov, A.A.
Makeev, V.J.
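Year : 2005
Journal : Bioinformatics
Publisher : Oxford University Press - Journals Department
Volume : 21
Issue : 10
Pages : 2240-2245
Peer reviewed : Yes (verified by ORBilu)
ISSN : 1367-4803
e-ISSN : 1460-2059
City : Oxford
Country : United Kingdom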
[en] Motivation: Transcription regulatory protein factors often bind DNA as homo-dimers or hetero-dimers. Thus they recognize structured DNA motifs that are inverted or direct repeats or spaced motif pairs. However, these motifs are often difficult to identify owing to their high divergence. Including the motif structure explicitly in the motif recognition algorithm improves recognition efficiency for highly divergent motifs, as well as the estimation of motif geometric parameters.
Results: We present a modification of the Gibbs sampling motif extraction algorithm, SeSiMCMC (Sequence Similarities by Markov Chain Monte Carlo), which finds structured motifs of these types, as well as non-structured motifs, in a set of unaligned DNA sequences. It employs improved estimators of motif and spacer lengths. The probability that a sequence does not contain any motif is accounted for in a rigorous Bayesian manner. We have applied the algorithm to a set of upstream regions of genes from two Escherichia coli regulons involved in respiration. We have demonstrated that accounting for a symmetric motif structure allows the algorithm to identify weak motifs more accurately. In the examples studied, ArcA binding sites were demonstrated to have the structure of a direct spaced repeat, whereas NarP binding sites exhibited a palindromic structure.
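To illustrate what "accounting for a symmetric motif structure" can mean in practice, the following is a minimal sketch, not the SeSiMCMC implementation. It assumes one common way to impose symmetry on a motif model: pooling base counts between positions that the structure declares equivalent. The function names, the pooling rule, and the toy sites are all hypothetical and are not taken from the article.

```python
from collections import Counter

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def count_matrix(sites):
    """Per-position base counts for aligned sites of equal length."""
    length = len(sites[0])
    return [Counter(site[i] for site in sites) for i in range(length)]


def symmetrize_palindrome(counts):
    """Assumed rule for an inverted repeat: pool column i with the
    complemented mirror column (length - 1 - i)."""
    length = len(counts)
    pooled = []
    for i in range(length):
        mirror = counts[length - 1 - i]
        pooled.append({b: counts[i][b] + mirror[COMPLEMENT[b]] for b in BASES})
    return pooled


def symmetrize_direct_repeat(counts, half, spacer):
    """Assumed rule for a direct spaced repeat: pool column i of the first
    half-site with column i of the second; spacer columns stay unchanged."""
    offset = half + spacer
    pooled = [{b: col[b] for b in BASES} for col in counts]
    for i in range(half):
        merged = {b: counts[i][b] + counts[i + offset][b] for b in BASES}
        pooled[i] = merged
        pooled[i + offset] = dict(merged)
    return pooled


# Toy aligned sites (hypothetical, for illustration only).
palindromic_counts = symmetrize_palindrome(count_matrix(["TGCA", "TGCA", "TACA"]))
repeat_counts = symmetrize_direct_repeat(
    count_matrix(["ACGTAC", "AGGTAC"]), half=2, spacer=2
)
```

Pooling doubles the number of observations behind each constrained column, which is one plausible reason a structured model could pick up weak motifs more reliably than an unconstrained one.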
Availability: The WWW interface of the program and its FreeBSD (4.0) and Windows 32 console executables are available at http://bioinform.genetika.ru/SeSiMCMC
10.1093/bioinformatics/bti336

File(s) associated to this reference

Fulltext file(s):

File : 03. Favorov et al., 2005 (SeSiMCMC).pdf
Version : Publisher postprint
Size : 117.45 kB
Access : Limited access

