TY - JOUR
T1 - Tumor Origin Detection with Tissue-Specific miRNA and DNA methylation Markers
JF - bioRxiv
DO - 10.1101/090746
SP - 090746
AU - Wei Tang
AU - Shixiang Wan
AU - Quan Zou
Y1 - 2016/01/01
UR - http://biorxiv.org/content/early/2016/12/01/090746.abstract
N2 - Motivation: Cancer of unknown primary origin constitutes 3-5% of all human malignancies. Patients with these carcinomas present with metastases without an established primary site, which may not be found even by thorough histological search methods. Patients with cancer of unknown primary origin always have a poor prognosis and rarely receive effective treatment, since most cancers respond well to specific chemotherapy or hormone drugs. Many studies have proposed classifiers based on miRNAs or mRNAs to predict tumor origins, but few studies focus on high-dimensional DNA methylation profiles. Results: We introduced three classifiers that combine a novel feature selection algorithm with random forest to effectively identify highly tissue-specific epigenetic biomarkers, such as microRNAs and CpG sites, which can help predict the origin site of tumors. This algorithm, which incorporates differential analysis and dimensionality reduction, was applied to 14 histological tissues and over 5000 samples, based on miRNA expression and DNA methylation profiles, to assign a given primary tumor to its tissue of origin. Our study shows that all three classifiers have an overall accuracy of 87.78% (72.55%-97.54%) based on miRNA datasets and an accuracy of 96.43% (MRMD: 87.85%-99.76%) or 97.06% (PCA: 92.44%-100%) based on DNA methylation datasets in predicting the origin of tumors, and suggests that the biomarkers we selected can efficiently predict the origin of tumors and allow clinicians to avoid adjuvant systemic therapy or to choose less aggressive therapeutic options.
We also developed a user-friendly webserver which enables users to predict the origin site of tumors by uploading the miRNA expression or DNA methylation profiles of those cancers. Availability: The webserver, data, and code are accessible free of charge at http://server.malab.cn/MMCOP/ Contact: zouquan{at}nclab.net Supplementary information: Supplementary data are available at Bioinformatics online.
ER -