Projects

Big Data Synchrophasor Monitoring and Analytics for Resiliency Tracking (BDSMART)

Funding:
U.S. Department of Energy (DOE)

The objective of this project was to utilize Big Data Analytics (BDA) to automate the monitoring of power systems from synchrophasor recordings. Such automation can improve the assessment of disturbance events that may affect power system resilience. As one of the project’s central aims, the effects of different expert-assisted scenarios for improved labeling of event data on event classification accuracy were thoroughly analyzed. To that end, a large-scale comparative analysis was conducted by assessing various traditional as well as more sophisticated event classification models on a dataset involving two years of synchrophasor measurements taken at various locations across one major interconnection of the U.S. power grid.

The experimental findings on rapidly refined, partially inspected, and fully inspected event labels provided evidence that convolutional neural networks (CNNs) outperform traditional models, regardless of the quality of the available event labels. When using such models, the event classification performance improved as more phasor measurement unit (PMU) signals were inspected by a domain expert. Using smaller fractions of fully inspected signals on their own typically yielded higher accuracy than using them in addition to rapidly refined signals. Finally, it was observed that performance similar to that obtained using entirely domain-driven labeling may be achieved as long as the expert is experienced enough not to mislabel more than ~5% of the event data.

UAS Detection and Counter-UAS Research and Development

Funding:
U.S. Air Force Research Laboratory (AFRL)

This project had two objectives related to the operation of drone swarms in GPS-denied environments: (1) the development of a structured model of environmental deviance to aid in autonomous navigation, and (2) the integration of such a model into a collision avoidance system. Both objectives were achieved, and the outcomes were tested in a simulated environment that mimics a GPS-denied scenario. Using data from hundreds of simulated swarm flights, the findings indicated that structured learning can improve navigational accuracy without the need for externally provided position feedback.

Disease Detection and Disease Progression Modeling

Funding:
IQVIA

The objective of this project was to determine whether the diagnosis of Alzheimer’s disease (AD) from electronic medical record (EMR) data alone (without relying on diagnostic imaging) could be significantly improved by applying clinical domain knowledge in data preprocessing and positive patient cohort selection rather than setting naive filters. Data were extracted from a repository of heterogeneous ambulatory EMR data collected from primary care medical offices across the United States. Selected Clinically Relevant Positive (SCRP) datasets were used as inputs to a recurrent deep neural network (RNN) model to predict whether a given patient may develop AD. The RNN model that used data relevant to AD performed significantly better when learning from the SCRP datasets than when learning from naively selected datasets. The integration of qualitative medical knowledge for dataset selection with deep learning techniques provided a mechanism for significant improvement of AD prediction.

Clinical Decision Support System (CDSS) for Multiple Choice Ranking in Cancer Comorbidity

Funding:
King Abdullah University of Science and Technology’s Center Partnership Fund Program

The goal of this project was to identify comorbidities and genes associated with colorectal cancer (CRC), the third most common cancer in the United States and the second leading cause of cancer death. The comorbidities of CRC were studied by designing a novel comorbidity network model based on the State Inpatient Database (SID) for the state of California, whose records were collected under the Healthcare Cost and Utilization Project (HCUP). Ranked lists of comorbidities and comorbidity networks were created, and the prevalence of comorbidities in different stages of CRC was determined. The comorbidity lists were utilized for text mining of PubMed and DisGeNET in order to extract genes associated with CRC. The results of the comorbidity network analyses indicated which comorbidities of CRC are highly expected. The discovered genes could be used to recruit more individuals who would benefit from genetic consultations. The identified associations between the comorbidities, CRC, and shared genes can have important implications for the early discovery and prognosis of CRC.

Big Data Analytics for Clinical Trials Optimization

Funding:
QuintilesIMS

Clinical trials optimization was facilitated by developing DeepMatch (DM), a novel approach based on recent advances in deep learning. DM was designed to learn from heterogeneous investigator-related and trial-related data sources and to rank investigators based on their expected enrollment performance on new clinical trials.

A large-scale evaluation was conducted on 2618 studies, in which the proposed ranking-based framework improved on the current state of the art by up to 19% in ranking investigators and by up to 10% in detecting top/bottom performers when recruiting investigators for new clinical trials. These findings indicated that DM can provide substantial improvement over current industry standards in several regards: (1) the enrollment potential of the investigator list, (2) the time it takes to generate the list, and (3) data-informed decisions about new investigators.