Posters

Presenting Author

Jannatul Ferdaus

Presenting Author Academic/Professional Position

Graduate Student

Academic Level (Author 1)

Graduate Student

Academic Level (Author 2)

Faculty

Academic Level (Author 3)

Faculty

Presentation Type

Poster

Discipline Track

Other

Bioinformatics

Abstract Type

Research/Clinical

Abstract

Background: Phosphatase-substrate interactions may serve as disease biomarkers and offer insights that support early diagnosis. Analyzing these substrate-specific interactions can also inform drug recommendations. In this study, we constructed a knowledge graph that provides a network representation of the diverse relationships among biological entities relevant to phosphatases.

Methods: Our knowledge graph contains five node types: kinase, phosphatase, substrate, drug, and disease. The edges between nodes are drawn from DrugMap (drugs and their target proteins; diseases and their associated proteins), STRING (protein-protein interactions), DEPOD 2019 (phosphatases and their substrates), PhosphoSitePlus (kinases and their substrates), and UniProt. We first preprocessed the data from these sources, then applied the Node2Vec embedding method to obtain a vector representation of each node, and finally used these embeddings to train models for link prediction between phosphatases and their substrates. We compared five machine learning models: Logistic Regression, Random Forest, Support Vector Machine, K-Nearest Neighbors, and Naïve Bayes.
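The walk-generation step that underlies Node2Vec can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the node names, edges, and the `p`/`q` values are hypothetical, and the graph is treated as undirected and unweighted. In a full Node2Vec pipeline the resulting walks would be passed to a skip-gram model to learn the node vectors; that training step is omitted here.

```python
import random
from collections import defaultdict

# Hypothetical toy graph; names and edges are illustrative, not the study's data.
NODE_TYPE = {
    "KIN_A": "kinase", "PPASE_A": "phosphatase", "SUB_A": "substrate",
    "SUB_B": "substrate", "DRUG_A": "drug", "DIS_A": "disease",
}
EDGES = [
    ("PPASE_A", "SUB_A"), ("KIN_A", "SUB_A"), ("KIN_A", "SUB_B"),
    ("DRUG_A", "PPASE_A"), ("DIS_A", "SUB_B"), ("PPASE_A", "KIN_A"),
]

def build_adjacency(edges):
    """Return an undirected adjacency map: node -> sorted list of neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}

def biased_walk(adj, start, length, p, q, rng):
    """One second-order random walk with return parameter p and in-out parameter q."""
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)   # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)       # stay at distance 1 from the previous node
            else:
                weights.append(1.0 / q)   # move outward, away from the previous node
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

def generate_walks(adj, num_walks, length, p=1.0, q=1.0, seed=0):
    """Start num_walks walks from every node, visiting nodes in shuffled order."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(biased_walk(adj, node, length, p, q, rng))
    return walks

adj = build_adjacency(EDGES)
walks = generate_walks(adj, num_walks=2, length=5, p=1.0, q=0.5)
```

Each walk is a sequence of node identifiers that mixes node types, which is how a heterogeneous graph can be embedded with a method that does not itself distinguish node types.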

Results: Random Forest was the best-performing model on our knowledge graph, with an accuracy, F1-score, and AUC of 95.23%, 95.18%, and 95.23%, respectively. Naïve Bayes performed worst among the five models. An embedding dimension of 16 gave the best fit for our knowledge graph.
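For readers unfamiliar with the three reported metrics, the sketch below shows how accuracy, F1-score, and AUC can be computed for a binary link-prediction task. The labels and scores are made-up toy values, not the study's data, and the 0.5 decision threshold is an assumption for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for ps in pos:
        for ns in neg:
            if ps > ns:
                wins += 1.0
            elif ps == ns:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))

# Toy example: 1 = true phosphatase-substrate link, 0 = sampled non-link.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]       # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy and F1-score depend on the chosen threshold, while AUC is computed from the ranking of the scores alone.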

Conclusion: By integrating a variety of data sources, we captured the functional relationships and associations between phosphatases and their substrates. Overall, link prediction of phosphatase-substrate interactions may help in the early diagnosis of different diseases.


Phosphatase-Substrate Prediction using Heterogeneous Knowledge Graph
