Gregory, Welliver, Chong (2020). Top 10 Reviewer Critiques of Radiology Artificial Intelligence (AI) Articles: Qualitative Thematic Analysis of Reviewer Critiques of Machine Learning/Deep Learning Manuscripts Submitted to JMRI. Journal of Magnetic Resonance Imaging: JMRI.

Abstract

Background: Classical machine learning (ML) and deep learning (DL) articles have rapidly captured the attention of the radiology research community and comprise an increasing proportion of articles submitted to JMRI, of variable reporting and methodological quality.
Purpose: To identify the most frequent reviewer critiques of classical ML and DL articles submitted to JMRI.
Study Type: Qualitative thematic analysis.
Population: In all, 1396 journal article manuscripts submitted to JMRI for consideration in 2018, with thematic analysis performed on reviewer critiques of 38 artificial intelligence (AI) articles, comprising 24 ML and 14 DL articles, from January 9, 2018 to June 2, 2018.
Field Strength/Sequence: N/A.
Assessment: After identifying and sampling ML and DL articles, and collecting all reviews, qualitative thematic analysis was performed to identify major and minor themes of reviewer critiques.
Statistical Tests: Descriptive statistics of article characteristics, and thematic review of major and minor themes.
Results: Thirty-eight articles were sampled for thematic review: 24 (63.2%) focused on classical ML and 14 (36.8%) on DL. The overall acceptance rate of classical ML/DL articles was 28.9%, similar to the overall 2017-2019 acceptance rate of 23.1-28.1%. These articles resulted in 72 reviews analyzed, yielding a total of 713 critiques that underwent formal thematic analysis consensus encoding. Ten major themes of critiques were identified, with "1-Lack of Information" as the most frequent, comprising 268 (37.6%) of all critiques. Frequent minor themes of critiques concerning ML/DL-specific recommendations included performing basic clinical statistics, such as to ensure similarity of training and test groups (N = 26), emphasizing strong clinical Gold Standards as the basis of training labels (N = 19), and ensuring strong radiological relevance of the topic and task performed (N = 16).
Data Conclusion: Standardized reporting of ML and DL methods could help address nearly one-third of all reviewer critiques made.
Level of Evidence: 4
Technical Efficacy Stage: 1
J. Magn. Reson. Imaging 2020.
© 2020 International Society for Magnetic Resonance in Medicine.

Links

http://www.ncbi.nlm.nih.gov/pubmed/31943495
http://dx.doi.org/10.1002/jmri.27035
