Biblioteca de la Universidad Complutense de Madrid

Testing the order of Markov dependence in DNA sequences


Pardo Llorente, Leandro and Menéndez Calleja, María Luisa and Pardo Llorente, María del Carmen and Zografos, Konstantinos (2011) Testing the order of Markov dependence in DNA sequences. Methodology and computing in applied probability, 13 (1). pp. 59-74. ISSN 1387-5841


DNA or protein sequences are usually modeled as probabilistic phenomena. The simplest model assumes that the nucleotides at the various sites are independently distributed. Usually, however, the type of nucleotide at one site depends on the type at another site, and so the DNA sequence is modeled as a Markov chain of random variables taking the values A, G, C and T, corresponding to the four nucleotides. First-order or higher-order Markov models provide a better fit to a DNA sequence. Based on this remark, the aim of this paper is to present and study a family of test statistics for testing the order of Markov dependence in DNA sequences. This new family includes the classical likelihood ratio test as a particular case. A simulation study is presented in order to find test statistics in this family that behave better than the likelihood ratio test.
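
As a rough illustration of the kind of test the abstract describes, the sketch below tests independence (order 0) against first-order Markov dependence on a nucleotide string. It uses the Cressie-Read power-divergence family, a well-known family of divergence statistics in which lam = 0 gives the likelihood ratio statistic and lam = 1 gives Pearson's chi-square. This is an assumption made for illustration: the function name power_divergence_stat and the parameter lam are hypothetical, and the statistics studied in the paper may be defined differently.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # the four nucleotides named in the abstract


def power_divergence_stat(seq, lam=0.0):
    """Illustrative statistic for order 0 versus order 1 (hypothetical helper).

    Builds the 4x4 table of adjacent-pair counts and compares it with the
    counts expected if consecutive nucleotides were independent.
    lam = 0 is the likelihood ratio statistic, lam = 1 is Pearson's chi-square.
    """
    if lam <= -1:
        raise ValueError("this sketch only handles lam > -1")

    pairs = Counter(zip(seq, seq[1:]))  # observed transition counts
    n = sum(pairs.values())
    row, col = Counter(), Counter()
    for (a, b), c in pairs.items():
        row[a] += c  # how often a appears as the first symbol of a pair
        col[b] += c  # how often b appears as the second symbol of a pair

    stat = 0.0
    for a in ALPHABET:
        for b in ALPHABET:
            obs = pairs.get((a, b), 0)
            if obs == 0:
                continue  # empty cells contribute nothing when lam > -1
            exp = row[a] * col[b] / n  # expected count under independence
            if abs(lam) < 1e-12:
                stat += 2.0 * obs * math.log(obs / exp)
            else:
                stat += 2.0 / (lam * (lam + 1.0)) * obs * ((obs / exp) ** lam - 1.0)
    return stat
```

Under the usual large-sample theory, and assuming all four nucleotides occur, such a statistic is compared with a chi-square distribution with (4 - 1)^2 = 9 degrees of freedom; large values suggest that a first-order model fits better than independence. Testing higher orders would need counts of longer words, which this sketch does not cover.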

Document type: Article
Keywords: DNA sequence; Markov dependence; Likelihood ratio test; Phi-divergence test statistics; Divergence; Chain
Subjects: Sciences > Mathematics > Applied statistics
ID code: 17330

Deposited: 05 Dec 2012 09:20
Last modified: 07 Feb 2014 09:45