Toward a clinical text encoder: pretraining for clinical natural language processing with applications to substance misuse

Dligach, Afshar, Miller (2019). Toward a clinical text encoder: pretraining for clinical natural language processing with applications to substance misuse. J Am Med Inform Assoc 26(11): 1272-1278.

Abstract

Our objective is to develop algorithms for encoding clinical text into representations that can be used for a variety of phenotyping tasks. Obtaining large datasets to take advantage of highly expressive deep learning methods is difficult in clinical natural language processing (NLP). We address this difficulty by pretraining a clinical text encoder on billing code data, which is typically available in abundance. We explore several neural encoder architectures and deploy the text representations obtained from these encoders in the context of clinical text classification tasks. While our ultimate goal is learning a universal clinical text encoder, we also experiment with training a phenotype-specific encoder. A universal encoder would be more practical, but a phenotype-specific encoder could perform better for a specific task. We successfully train several clinical text encoders, establish a new state-of-the-art on comorbidity data, and observe good performance gains on substance misuse data. We find that pretraining using billing codes is a promising research direction. The representations generated by this type of pretraining have universal properties, as they are highly beneficial for many phenotyping tasks. Phenotype-specific pretraining is a viable route for trading the generality of the pretrained encoder for better performance on a specific phenotyping task. We successfully applied our approach to many phenotyping tasks. We conclude by discussing potential limitations of our approach. © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.
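
The pretraining idea in the abstract (learn an encoder by predicting billing codes, then reuse its output as features for a phenotyping classifier) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not describe the architectures, so the averaged-embedding encoder, the multi-label logistic output layer, the function names `encode` and `pretrain`, all hyperparameters, and the four made-up notes are assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random

def encode(tokens, emb, dim):
    """Average the embeddings of known tokens into one fixed-size vector."""
    known = [t for t in tokens if t in emb]
    vec = [0.0] * dim
    for t in known:
        for i in range(dim):
            vec[i] += emb[t][i]
    n = max(len(known), 1)  # avoid division by zero for empty or unseen input
    return [v / n for v in vec]

def sigmoid(x):
    x = max(-30.0, min(30.0, x))  # clamp to keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-x))

def pretrain(notes, codes, dim=8, epochs=200, lr=0.5, seed=0):
    """Fit token embeddings and a multi-label logistic layer on billing codes.

    Hypothetical setup: one sigmoid output per code, binary cross-entropy loss,
    plain stochastic gradient descent.
    """
    rng = random.Random(seed)
    vocab = sorted({t for note in notes for t in note})
    labels = sorted({c for cs in codes for c in cs})
    emb = {t: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for t in vocab}
    W = {c: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for c in labels}
    b = {c: 0.0 for c in labels}
    for _ in range(epochs):
        for note, cs in zip(notes, codes):
            h = encode(note, emb, dim)
            grad_h = [0.0] * dim
            for c in labels:
                p = sigmoid(sum(w * x for w, x in zip(W[c], h)) + b[c])
                err = p - (1.0 if c in cs else 0.0)  # gradient of the loss w.r.t. the logit
                for i in range(dim):
                    grad_h[i] += err * W[c][i]  # uses the weight before its update
                    W[c][i] -= lr * err * h[i]
                b[c] -= lr * err
            # Backpropagate through the average into each token embedding.
            toks = [t for t in note if t in emb]
            for t in toks:
                for i in range(dim):
                    emb[t][i] -= lr * grad_h[i] / len(toks)
    return emb, W, b

# Made-up tokenized notes paired with ICD-10 categories
# (F10: alcohol-related disorders, F11: opioid-related disorders).
notes = [["alcohol", "withdrawal"], ["opioid", "overdose"],
         ["alcohol", "intoxication"], ["opioid", "dependence"]]
codes = [{"F10"}, {"F11"}, {"F10"}, {"F11"}]

emb, W, b = pretrain(notes, codes)

# Downstream use: the frozen encoder turns a new note into a feature vector
# that a separate phenotyping classifier could consume.
features = encode(["alcohol", "dependence"], emb, 8)
```

In this sketch the billing-code output layer (`W`, `b`) is discarded after pretraining and only the embeddings are reused. A phenotype-specific variant, in the sense the abstract describes, would presumably restrict the pretraining targets to codes related to the phenotype of interest; how the paper does this is not stated in the abstract.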

Links

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6798566
http://www.ncbi.nlm.nih.gov/pubmed/31233140
http://dx.doi.org/10.1093/jamia/ocz072
