Complutense University Library

Prediction of methylated CpGs in DNA sequences using a support vector machine

Bhasin, Manoj and Zhang, Hong and Reinherz, Ellis L and Reche, Pedro A (2005) Prediction of methylated CpGs in DNA sequences using a support vector machine. FEBS Letters, 579 (20). pp. 4302-8. ISSN 0014-5793

Official URL: http://www.elsevier.com/wps/find/journaldescription.cws_home/506085/description#description

Abstract

DNA methylation plays a key role in the regulation of gene expression. The most common type of DNA modification is the methylation of cytosine in the CpG dinucleotide. At present, no method is available for predicting DNA methylation sites. Therefore, in this study we developed a support vector machine (SVM)-based method for predicting cytosine methylation in CpG dinucleotides. Initially, an SVM module was developed from human data for the prediction of human-specific methylation sites. This module achieved a Matthews correlation coefficient (MCC) of 0.501 and an area under the ROC curve (AUC) of 0.814 when evaluated using 5-fold cross-validation. The SVM-based module performed better than classifiers built using alternative machine learning and statistical algorithms, including artificial neural networks, Bayesian statistics, and decision trees. Additional SVM modules were also developed based on mammalian- and vertebrate-specific methylation patterns. The SVM module based on human methylation patterns was used for genome-wide analysis of methylation sites. This analysis showed that the percentage of methylated CpGs is higher in UTRs than in exonic and intronic regions of human genes. The method is available online for public use under the name Methylator at http://bio.dfci.harvard.edu/Methylator/.
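
For readers unfamiliar with the two scores quoted in the abstract, the following is a minimal sketch of how MCC and AUC are conventionally computed. It is not the authors' code: the function names `mcc` and `auc` and the toy inputs are illustrative assumptions, and the abstract does not say how the published values were calculated.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Hypothetical helper for illustration; ranges from -1 (total
    disagreement) through 0 (chance level) to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (label 1)
    receives a higher score than a randomly chosen negative (label 0);
    ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only; these are not results from the paper.
    print(mcc(tp=40, tn=35, fp=15, fn=10))
    print(auc([0.9, 0.7, 0.4, 0.2], [1, 0, 1, 0]))
```

Under these definitions, an MCC of 0 and an AUC of 0.5 both correspond to chance-level prediction, which gives a baseline for reading the reported 0.501 and 0.814.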

Item Type: Article
Uncontrolled Keywords: DNA; CpG; Methylation; Support vector machine; Prediction
Subjects: Medical sciences > Biology > Molecular biology; Sciences > Computer science > Bioinformatics
ID Code: 9330
Deposited On: 14 Aug 2009 11:05
Last Modified: 30 Sep 2009 10:44