Predicting Complications in Critical Care Using Heterogeneous Clinical Data

Patients in hospitals, particularly in critical care, are susceptible to many complications affecting morbidity and mortality. Digitized clinical data in electronic medical records can be used to develop machine learning models that identify patients at risk of complications early, so that care can be prioritized to prevent them. However, clinical data from heterogeneous sources within hospitals pose significant modeling challenges. In particular, unstructured clinical notes are a valuable source of information, containing regular assessments of the patient's condition, but they use inconsistent abbreviations and lack the structure of formal documents. Our contributions in this paper are twofold. First, we present a new preprocessing technique for extracting features from informal clinical notes that can be used in a classification model to identify patients at risk of developing complications. Second, we explore the use of collective matrix factorization, a multi-view learning technique, to model heterogeneous clinical data, combining text-based features with other measurements such as clinical investigations, comorbidities, and demographic data. We present a detailed case study on postoperative respiratory failure using more than 700 patient records from the MIMIC II database. Our experiments demonstrate the efficacy of our preprocessing technique in extracting discriminatory features from clinical notes, as well as the benefits of multi-view learning in combining clinical measurements with text data for predicting complications.
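As a rough illustration of the multi-view idea mentioned in the abstract, the sketch below factorizes two matrices that share their rows (patients): one holding text-derived features and one holding other clinical measurements. It is not the authors' implementation. It assumes a squared-error loss, L2 regularization, fully observed matrices, and alternating least squares updates; the function name `collective_mf` and all parameter names are hypothetical.

```python
import numpy as np


def collective_mf(X, Y, rank=5, reg=0.1, n_iters=50, seed=0):
    """Jointly factorize X ~ U V^T and Y ~ U W^T with shared row factors U.

    X: (n_patients, n_text_features), Y: (n_patients, n_clinical_features).
    Assumption: squared loss with L2 penalty `reg`, solved by alternating
    least squares. This is an illustrative sketch, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(X.shape[1], rank))
    W = rng.normal(scale=0.1, size=(Y.shape[1], rank))
    ridge = reg * np.eye(rank)

    for _ in range(n_iters):
        # Shared patient factors: both views contribute to the update.
        A = V.T @ V + W.T @ W + ridge
        U = np.linalg.solve(A, (X @ V + Y @ W).T).T
        # View-specific factors: ordinary ridge regression given U.
        G = U.T @ U + ridge
        V = np.linalg.solve(G, (X.T @ U).T).T
        W = np.linalg.solve(G, (Y.T @ U).T).T
    return U, V, W


if __name__ == "__main__":
    # Synthetic stand-in data; real inputs would be per-patient feature matrices.
    rng = np.random.default_rng(1)
    latent = rng.normal(size=(30, 3))
    X = latent @ rng.normal(size=(3, 8))
    Y = latent @ rng.normal(size=(3, 5))
    U, V, W = collective_mf(X, Y, rank=3, reg=1e-3, n_iters=100)
    print("relative error (X):", np.linalg.norm(X - U @ V.T) / np.linalg.norm(X))
```

The rows of `U` give one low-dimensional representation per patient that reflects both views, and such a representation could then be passed to a downstream classifier.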

Keywords

Clinical notes, topic models, heterogeneous data, multi-view learning, collective matrix factorization, postoperative respiratory failure

Persistent link

http://hdl.handle.net/10919/82380

Collections

Destination Area: Data and Decisions (D&D)
Scholarly Works, Computer Science
