Predictive Modeling for Metabolomics Data

Ghosh, Zhang, Ghosh, Kechris (2020). Predictive Modeling for Metabolomics Data. Methods Mol Biol 2104: 313-336.

Abstract

In recent years, mass spectrometry (MS)-based metabolomics has been extensively applied to characterize biochemical mechanisms and to study physiological processes and phenotypic changes associated with disease. Metabolomics has also been important for identifying biomarkers suitable for clinical diagnosis. In this chapter, we review supervised learning algorithms for predictive modeling, such as random forest (RF), support vector machine (SVM), and partial least squares-discriminant analysis (PLS-DA). We also review feature selection methods for identifying the combination of metabolites that yields an accurate predictive model. We conclude with best practices for reproducibility: including internal and external replication, reporting metrics to assess performance, and following guidelines to avoid overfitting and to handle imbalanced classes. An analysis of an example dataset illustrates the use of different machine learning methods and performance metrics.
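The abstract's point about reporting performance metrics under imbalanced classes can be illustrated with a small sketch. The code below is not taken from the chapter; it is a minimal, standard-library-only illustration, and the function names (`confusion_counts`, `summarize`) and the toy labels are hypothetical. It computes accuracy alongside sensitivity, specificity, balanced accuracy, and the Matthews correlation coefficient (MCC) for a binary case/control prediction.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred, positive=1):
    """Return several metrics; undefined ratios are reported as 0.0 (an assumption)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy, made-up labels: 10 cases (1) and 90 controls (0),
# with a classifier that always predicts "control".
y_true = [1] * 10 + [0] * 90
y_pred = [0] * 100
result = summarize(y_true, y_pred)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

On this toy input, accuracy is 0.9 even though no case is detected, while balanced accuracy is 0.5 and MCC is 0. This is why accuracy alone can be misleading when classes are imbalanced.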

Links

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7423323
http://www.ncbi.nlm.nih.gov/pubmed/31953824
http://dx.doi.org/10.1007/978-1-0716-0239-3_16