Time-series in healthcare: challenges and solutions.
In this post we discuss the challenges of working with patient time-series data and some solutions, using deep learning models trained on that data to diagnose a patient's disease early.
Machine learning cannot practice medicine, but it can provide data-driven information that is interpretable, trustworthy, and actionable.
A system is built to support the pathway screening -> diagnosis -> treatment, using predictive models to give patient-level recommendations and support.
Several machine learning and deep learning models are used to work with longitudinal time-series data, such as RNNs, LSTMs, CNNs, and Gaussian processes.
Challenges:
Different patients have different genetic and disease characteristics, which makes it very difficult to make accurate predictions.
Data comes in different forms, such as images and text, which makes it challenging to build a training dataset.
A Markov model can be used to predict disease in patients and to forecast from time-series data. Its limitations are that it can be used for only one disease at a time, and that it gives generalized results rather than results specific to a particular patient.
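To make the Markov idea concrete, here is a minimal sketch of forecasting with a discrete-state Markov chain. The three states and every transition probability are made up for illustration and do not come from any real disease model. Note that the same transition matrix is applied to every patient, which is exactly the "generalized, not patient-specific" limitation described above.

```python
import numpy as np

# Hypothetical 3-state disease model: 0 = healthy, 1 = early stage, 2 = advanced.
# Each row is P(next state | current state); the numbers are illustrative only.
TRANSITIONS = np.array([
    [0.90, 0.08, 0.02],
    [0.10, 0.70, 0.20],
    [0.00, 0.05, 0.95],
])

def forecast_states(initial, transitions, steps):
    """Propagate a state distribution forward by `steps` time steps."""
    dist = np.asarray(initial, dtype=float)
    history = [dist]
    for _ in range(steps):
        # Markov property: the next distribution depends only on the current one.
        dist = dist @ transitions
        history.append(dist)
    return np.stack(history)

# Start from a patient known to be healthy and look 5 steps ahead.
forecast = forecast_states([1.0, 0.0, 0.0], TRANSITIONS, steps=5)
print(forecast.round(3))
```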
To overcome the limitations of the Markov model, deep learning models such as RNNs are used to get patient-specific results and to diagnose multiple diseases at a time.
The disadvantage of deep learning methods such as RNNs is that they are not easily interpretable, so medical practitioners cannot use them easily.
To overcome the disadvantages of both Markov models and deep learning models, an Attentive State Space Model (ASSM) is used.
Attentive State Space Models (ASSMs) are a class of probabilistic models that combine state space models with attention mechanisms. State space models describe how latent states (hidden information, like the hidden topics in LDA) evolve over time, often in the context of time-series data, so we can define the latent states we are interested in and get specific results. Attention mechanisms allow a model to focus on specific parts of the input data when making predictions or decisions.
The idea behind Attentive State Space Models is to enhance the traditional state space modeling framework by incorporating attention mechanisms, which allow the model to selectively emphasize or ignore certain parts of the data sequence at different time steps. This can lead to more accurate and interpretable predictions.
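As a rough illustration of the attention part only, the sketch below computes softmax attention weights over a patient's past visits. This is generic scaled dot-product attention on synthetic data, not the specific formulation used in ASSMs; the function name, the choice of the latest visit as the query, and the data are all assumptions. The weights show which time steps the summary leans on, which is where the interpretability comes from.

```python
import numpy as np

def attention_weights(history, query):
    """Softmax attention over past time steps (scaled dot-product scoring)."""
    scores = history @ query / np.sqrt(history.shape[1])
    scores = scores - scores.max()        # subtract the max for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

rng = np.random.default_rng(0)
history = rng.normal(size=(6, 4))         # 6 past visits, 4 features each (synthetic)
query = history[-1]                       # assumption: use the latest visit as the query
weights = attention_weights(history, query)
context = weights @ history               # attention-weighted summary of the history
print(weights.round(3))
```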
Below is the typical architecture of the neural network used in the research, with a loss function specific to the research.
Clustering:
Clustering is done to find similar patients by analysing the similarity between the time series of different patients, using methods such as Dynamic Time Warping.
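For readers who have not seen Dynamic Time Warping (DTW) before, here is a minimal sketch of the classic dynamic-programming distance for one-dimensional series. The patient series are invented, and a real pipeline would more likely use an optimized library and then feed the pairwise distances to a clustering algorithm.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance with absolute difference as local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed alignment moves.
            cost[i, j] = local + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Synthetic examples: b has the same shape as a but starts one step later.
patient_a = [1, 2, 3, 4, 3, 2]
patient_b = [1, 1, 2, 3, 4, 3, 2]
patient_c = [4, 4, 4, 4, 4, 4]

print(dtw_distance(patient_a, patient_b))  # 0.0, the delay is warped away
print(dtw_distance(patient_a, patient_c))  # 9.0
```

Because DTW tolerates shifts in time, two patients whose disease follows the same course at different speeds can still end up in the same cluster.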
Below is an illustration of how a patient can be assigned to different clusters based on observations over time.
Screening and Monitoring:
The time interval at which a patient is screened is very important for making diagnosis cost-effective. To find it, the error is minimized on datasets sampled at different time intervals, and the interval that gives the best result is chosen as the one at which the patient should be screened.
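One simple way to picture this trade-off is sketched below. It assumes a densely sampled biomarker series, reconstructs it from visits taken every k steps by linear interpolation, and adds a fixed cost per visit. The cost function, the `visit_cost` value, and the synthetic series are all assumptions made for illustration, not the method from the research.

```python
import numpy as np

def interval_cost(series, interval, visit_cost):
    """Reconstruction error when screening every `interval` steps, plus visit costs."""
    t = np.arange(len(series))
    visits = t[::interval]
    # Between visits we only know the interpolated value; after the last
    # visit np.interp holds the last observed value.
    reconstructed = np.interp(t, visits, series[visits])
    error = np.mean((series - reconstructed) ** 2)
    return error + visit_cost * len(visits)

def best_interval(series, intervals, visit_cost):
    costs = {k: interval_cost(series, k, visit_cost) for k in intervals}
    return min(costs, key=costs.get), costs

t = np.arange(48)
series = np.sin(t / 4.0)                  # synthetic biomarker trajectory
best, costs = best_interval(series, intervals=[1, 2, 4, 8, 12], visit_cost=0.01)
print(best, {k: round(v, 3) for k, v in costs.items()})
```

Screening at every step gives zero reconstruction error but the highest visit cost, so the chosen interval depends entirely on how the two terms are weighted.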
Early Diagnosis:
The model below is used for early diagnosis. From previous data, where we know at what time early symptoms occur, we can detect the disease early by screening the patient at time t5 instead of t7 and prevent the spread of the disease.
AutoML:
It is important to decide which model to use, because there are multiple candidates such as LSTM, GRU, and RNN. To choose between them, accuracy is calculated for these models on time-series data of the same length.
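The selection loop itself is simple, as the sketch below shows: score every candidate on the same fixed-length windows and keep the best one. To stay self-contained it uses three trivial forecasters as stand-ins for LSTM, GRU, and RNN; the function names and the synthetic series are assumptions.

```python
import numpy as np

# Stand-in forecasters (assumptions): each maps a window to a one-step-ahead prediction.
def last_value(window):
    return window[-1]

def moving_average(window):
    return window.mean()

def linear_trend(window):
    t = np.arange(len(window))
    slope, intercept = np.polyfit(t, window, 1)
    return slope * len(window) + intercept

def validation_error(model, series, window_len):
    """Mean squared one-step-ahead error over all windows of the same length."""
    errors = []
    for start in range(len(series) - window_len):
        window = series[start:start + window_len]
        target = series[start + window_len]
        errors.append((model(window) - target) ** 2)
    return float(np.mean(errors))

def select_model(candidates, series, window_len):
    scores = {name: validation_error(m, series, window_len) for name, m in candidates.items()}
    return min(scores, key=scores.get), scores

series = 0.5 * np.arange(40) + 2.0        # synthetic, perfectly linear series
candidates = {"last_value": last_value, "moving_average": moving_average, "linear_trend": linear_trend}
best_name, scores = select_model(candidates, series, window_len=8)
print(best_name, scores)
```

Using the same window length for every candidate keeps the comparison fair; a real AutoML setup would also search hyperparameters and hold out separate patients for validation.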
Interpretability:
It is important that the results produced by machine learning models can be explained to medical professionals, so that the model is interpretable to them. For this, the most important features are identified, since they explain the result that the model predicts.
Feature Importance:
To detect the most important (salient) features, a previous value is changed and the effect on the future output is captured. If the future outcome changes, the feature is important.
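A minimal sketch of this perturbation idea follows. It shifts the history of one feature at a time and records how much the model output moves. The toy risk model and the shift size `delta` are assumptions; any model that maps a (time, features) array to a number could be plugged in.

```python
import numpy as np

def perturbation_importance(model, x, delta=1.0):
    """Absolute change in the output when one feature's past values are shifted by delta."""
    baseline = model(x)
    importance = np.zeros(x.shape[1])
    for f in range(x.shape[1]):
        perturbed = x.copy()
        perturbed[:, f] += delta          # change the previous values of feature f only
        importance[f] = abs(model(perturbed) - baseline)
    return importance

def toy_model(x):
    # Hypothetical risk score: uses features 0 and 2 and ignores feature 1.
    return 2.0 * x[:, 0].mean() - 0.5 * x[-1, 2]

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 3))              # 10 time steps, 3 features (synthetic)
importance = perturbation_importance(toy_model, x)
print(importance.round(3))                # feature 1 gets zero importance
```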
Black Box and Masking:
Masking hides certain non-empty values, and the error relative to the original value is measured. The mask is then updated based on that error so as to minimize it. This method is used to find the most important features.
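Here is one possible reading of that procedure, as a sketch under strong assumptions: the black box is replaced by a toy linear model so the gradient can be written by hand, hidden values are replaced by zero, and a sparsity penalty is added so the mask does not trivially keep everything. The penalty and all parameter values are assumptions, not details from the research.

```python
import numpy as np

def learn_mask(x, weights, sparsity=0.1, lr=0.5, steps=2000):
    """Learn a soft mask in (0, 1) over the entries of x by gradient descent.

    Toy setting (an assumption): the model is linear, f(x) = sum(weights * x),
    and masked-out values are replaced by zero.
    """
    theta = np.zeros_like(x)                  # mask logits; sigmoid(0) = 0.5
    target = np.sum(weights * x)              # prediction on the unmasked input
    for _ in range(steps):
        mask = 1.0 / (1.0 + np.exp(-theta))
        error = np.sum(weights * mask * x) - target
        # Gradient of error**2 + sparsity * sum(mask) with respect to the mask.
        grad_mask = 2.0 * error * weights * x + sparsity
        theta -= lr * grad_mask * mask * (1.0 - mask)   # chain rule through the sigmoid
    return 1.0 / (1.0 + np.exp(-theta))

x = np.ones((4, 3))                           # 4 time steps, 3 features (synthetic)
weights = np.tile([1.0, 0.0, 0.2], (4, 1))    # feature 1 has no effect on the output
mask = learn_mask(x, weights)
feature_scores = mask.mean(axis=0)            # average mask value per feature
print(feature_scores.round(2))
```

Entries whose mask stays close to 1 are the ones the prediction cannot do without, so they are read as the important features.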
Uncertainty estimation:
For every prediction we need to know how accurate the result is likely to be, because in healthcare confidence in a result is very important. For this, we resample the input over different features for a particular data point and compute the result each time; the range in which the results lie for that data point is the confidence interval.
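The resampling idea can be sketched as follows, assuming the input is resampled by adding Gaussian noise to each feature and the interval is read off the empirical quantiles of the predictions. The noise scale, the number of samples, and the toy model are assumptions chosen for illustration.

```python
import numpy as np

def prediction_interval(model, x, noise_scale=0.1, n_samples=1000, level=0.95, seed=0):
    """Point prediction plus an empirical interval from predictions on resampled inputs."""
    rng = np.random.default_rng(seed)
    preds = np.array([
        model(x + rng.normal(scale=noise_scale, size=x.shape))
        for _ in range(n_samples)
    ])
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(preds, [tail, 1.0 - tail])
    return float(model(x)), float(lower), float(upper)

def toy_model(x):
    # Hypothetical risk score built from two features.
    return 0.5 * x[0] + 0.2 * x[1]

x = np.array([1.0, 2.0])
point, lower, upper = prediction_interval(toy_model, x)
print(f"prediction {point:.3f}, interval [{lower:.3f}, {upper:.3f}]")
```

A wide interval tells the clinician that small changes in the measurements would change the prediction a lot, so the result should be treated with more caution.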
Dealing with missing data:
For this, a Multidimensional-RNN is used, and a dropout layer is added to take care of the missing values.