Posters

Presenting Author

Sunakhi Sahoo

Presenting Author Academic/Professional Position

High School Student

Academic Level (Author 1)

High School Student

Discipline/Specialty (Author 1)

Population Health and Biostatistics

Academic Level (Author 2)

Faculty

Discipline/Specialty (Author 2)

Population Health and Biostatistics

Presentation Type

Poster

Discipline Track

Biomedical ENGR/Technology/Computation

Abstract Type

Research/Clinical

Abstract

Background: Diabetic heart failure (DHF) is a chronic, progressive disease associated with both diabetes and heart failure (HF). Although knowledge of these diseases has advanced considerably, much remains to be learned about the genetic overlap between the two. In this study, we used gene expression data from patients with DHF, patients with HF, and a control group of individuals who died of natural causes to identify genes associated with DHF and HF. We sought genes with altered expression levels that could play a role in the disease pathways.

Methods: The data set was formatted as a matrix comprising three groups: 5 control samples (individuals who died of natural causes), 7 DHF samples, and 12 non-DHF (HF) samples. All data analyses were performed in R and RStudio using CRAN packages. Volcano plots were used to identify differentially expressed genes based on log2 fold-change and t-test results. Machine learning models (Naive Bayes, Random Forest, and Logistic Regression) were trained to classify control and diseased groups and were evaluated using 5-fold cross-validation and confusion matrices. Functional pathway enrichment tools, such as EnrichR and StringPPI, were used to identify biological processes and genetic pathways associated with the identified genes. To examine gene expression clustering and the relationships between the sample groups, we used data visualization methods such as UMAP (Uniform Manifold Approximation and Projection) and heatmaps.
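As a rough illustration of the two quantities behind a volcano plot (effect size and a t-test statistic), the following is a minimal Python sketch, not the study's R code. It assumes positive, non-log-scaled expression values; the gene names, toy values, cutoffs, and the choice of Welch's t statistic are all hypothetical choices made for this example. A full analysis would convert the statistic to a p-value and plot -log10(p) against the log2 fold-change.

```python
import math
from statistics import mean, variance


def log2_fold_change(case, control):
    """log2 of the ratio of group means (assumes positive, non-log-scaled values)."""
    return math.log2(mean(case) / mean(control))


def welch_t(case, control):
    """Welch's t statistic for two independent groups with unequal variances."""
    se = math.sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se


def volcano_points(expr, case_cols, control_cols):
    """Map each gene to (log2 fold-change, t statistic).

    expr is a hypothetical gene -> list-of-sample-values mapping.
    """
    points = {}
    for gene, values in expr.items():
        case = [values[i] for i in case_cols]
        control = [values[i] for i in control_cols]
        points[gene] = (log2_fold_change(case, control), welch_t(case, control))
    return points


def flag_genes(points, min_abs_lfc=1.0, min_abs_t=2.0):
    """Keep genes passing both cutoffs (the cutoffs are illustrative assumptions)."""
    return sorted(
        gene
        for gene, (lfc, t) in points.items()
        if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t
    )


# Toy matrix: columns 0-2 are controls, columns 3-5 are diseased samples.
expr = {
    "GENE_A": [10.0, 11.0, 9.0, 40.0, 42.0, 38.0],
    "GENE_B": [20.0, 21.0, 19.0, 20.5, 19.5, 20.0],
}
points = volcano_points(expr, case_cols=[3, 4, 5], control_cols=[0, 1, 2])
flagged = flag_genes(points)
print(flagged)
```

In this toy data only GENE_A shows both a large fold-change and a large t statistic, so it is the only gene flagged.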

Results: The volcano plots identified 149 genes as differentially expressed in the diseased groups. EnrichR analysis linked these genes to pathways such as lipid and atherosclerosis, insulin resistance, and several signaling pathways. The UMAP analysis showed a clear distinction between the control and diseased groups, with some overlap between the DHF and HF groups. The heatmaps were consistent with the UMAP results: similarly expressed genes grouped into one cluster within each group. The accuracy of the predictive models was as follows: Naive Bayes = 100%, Random Forest = 80%, and Logistic Regression = 60%. Further analysis of the regulated pathways showed close interactions among the highly expressed genes, which may help explain the mechanisms of DHF and HF.
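To make the link between a confusion matrix, an accuracy figure, and 5-fold cross-validation concrete, here is a small Python sketch. It is an illustration under stated assumptions, not the study's R workflow: the label names, the round-robin fold assignment, and the toy predictions are hypothetical, and only the group sizes (5 control, 19 diseased) are taken from the abstract.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, positive="disease"):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            counts["tp" if pred == positive else "fn"] += 1
        else:
            counts["fp" if pred == positive else "tn"] += 1
    return {key: counts[key] for key in ("tp", "fn", "fp", "tn")}


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    return (cm["tp"] + cm["tn"]) / sum(cm.values())


def stratified_folds(labels, k=5):
    """Assign sample indices to k folds round-robin within each class,
    so every fold receives a share of each class where possible."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Group sizes from the abstract: 5 controls and 7 + 12 = 19 diseased samples.
labels = ["control"] * 5 + ["disease"] * 19
folds = stratified_folds(labels, k=5)
fold_sizes = [len(fold) for fold in folds]

# Hypothetical predictions for a single held-out fold of five samples.
y_true = ["control", "disease", "disease", "disease", "disease"]
y_pred = ["control", "disease", "disease", "control", "disease"]
cm = confusion_matrix(y_true, y_pred)
print(fold_sizes, cm, accuracy(cm))
```

With only 24 samples, each held-out fold contains four or five samples, so a single misclassification shifts that fold's accuracy by 20 percentage points or more. Accuracy estimates at this sample size are therefore coarse.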

Conclusion: Our study provides evidence of 149 genes with altered expression in DHF and HF and relates them to important biological processes. The findings also suggest that machine learning models have high potential to differentiate between healthy and diseased conditions, with the Naive Bayes model showing the highest accuracy. The bioinformatics, pathway, and gene-mapping analyses identified genetic relationships and candidate disease mechanisms that can form a basis for future studies.


Unraveling Genetic Links Between Diabetes and Heart Failure-A Machine Learning Approach

