Dental Tribune India

“Designed an open-source software for Post-Translational Modifications database” Dr Sachin Gavali

By Rajeev Chitguppi, Dental Tribune South Asia
May 18, 2020

Researchers at the Wu Lab, University of Delaware, have built a database of post-translational modifications of proteins. To make it useful to other researchers, someone also has to build software that can extract data from this database and transform it into a format suitable for bioinformatics pipelines. Dr Sachin Gavali, who is currently pursuing his PhD in Bioinformatics in the USA, has built that software. He has also published it as open source, which means the source code is available to the general public to use for any purpose (including commercial ones) or to modify from its original design. In this communication, Dr Sachin summarizes the research work.

Biological systems synthesize proteins through a two-step process: transcription of genes into messenger RNA, followed by translation of that RNA into protein. Post-translational modifications (PTMs) of these synthesized proteins play a key role in increasing their diversity and utility. For example, the human genome has 20,000 to 25,000 genes, but the resulting proteome is estimated to contain several billion proteins [1]. Accurate identification and characterization of these PTMs is essential for a better understanding of the underlying cellular processes and for the development of novel drug therapies [2].

Wu Lab team that was instrumental in developing the original iPTMnet database

To this end, the Wu Lab, led by Dr Cathy Wu at the University of Delaware [3], has developed iPTMnet for PTM knowledge discovery [4]. We have employed advanced techniques such as text mining, data mining, and ontological representation to extract PTM information from the existing literature and from curated databases. The resulting database catalogues 700,000 unique PTM sites across 63,000 proteins.

Currently, this data can be accessed from the iPTMnet website (https://research.bioinformatics.udel.edu/iptmnet/), which provides features such as searching and browsing of the PTM data, batch retrieval of enzymes and protein-protein interactions, an integrated protein sequence alignment viewer, and a Cytoscape network view to visualize the interactions among these proteins. However, we found that it was difficult for biologists to integrate this data into large-scale bioinformatics analyses because there was no way to automate access to it. Hence, to make the data easier to reach and to streamline its integration into existing bioinformatics pipelines, we have developed a RESTful API [5] for iPTMnet.

The API provides a set of well-defined functions to access the data underlying every possible view on the iPTMnet website [6]. Further, many biologists are proficient in the use of Python and R, and hence we have developed Python and R client packages that handle all the technical details of using the API such as communicating with the API server, decoding the data, and then transforming it into the format requested by the user [7], [8].
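As a rough illustration of the three steps named above (communicating with the server, decoding the data, and transforming it into a requested format), a minimal Python sketch might look like the following. It is not the PyiPTMnet code: the base URL, the route layout, and the field names are all hypothetical placeholders, and only the Python standard library is used.

```python
import csv
import io
import json
from typing import Callable
from urllib.parse import quote
from urllib.request import urlopen

# Hypothetical base URL and route layout, for illustration only.
# The real endpoints are described in the iPTMnet API documentation.
BASE_URL = "https://example.org/iptmnet/api"


def http_get(url: str) -> bytes:
    """Step 1, communicate: fetch the raw response body from the server."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def get_records(protein_id: str,
                fetch: Callable[[str], bytes] = http_get) -> list:
    """Step 2, decode: request one protein's records and parse the JSON body.

    `fetch` is injectable so the function can be exercised without a network.
    """
    url = f"{BASE_URL}/{quote(protein_id)}/substrate"  # hypothetical route
    return json.loads(fetch(url).decode("utf-8"))


def to_csv(records: list) -> str:
    """Step 3, transform: turn a list of flat JSON objects into CSV text.

    Column order follows the keys of the first record; keys that appear
    only in later records are ignored.
    """
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

With a stand-in for the server, the sketch can be tried offline; the field names `site` and `ptm_type` here are invented for the example:

```python
fake_server = lambda url: b'[{"site": "S15", "ptm_type": "Phosphorylation"}]'
print(to_csv(get_records("Q15796", fetch=fake_server)))
```

Keeping the three steps in separate functions means the output format can change (for instance, to a data frame) without touching the code that talks to the server.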

We believe that the RESTful API for iPTMnet will be a useful addition for the scientific community. It has the potential to benefit researchers, by making the data more accessible, and developers, by making the underlying tools and data easier to maintain.

 

References

[1]   L. M. Smith, N. L. Kelleher, and Consortium for Top-Down Proteomics, “Proteoform: a single term describing protein complexity,” Nat. Methods, vol. 10, no. 3, pp. 186–187, Mar. 2013, DOI: 10.1038/nmeth.2369.

[2]   T. M. Karve and A. K. Cheema, “Small Changes Huge Impact: The Role of Protein Posttranslational Modifications in Cellular Homeostasis and Disease,” J. Amino Acids, vol. 2011, pp. 1–13, 2011, DOI: 10.4061/2011/207691.

[3]   “Cathy Wu, Ph.D.,” CBCB. https://bioinformatics.udel.edu/people/personnel/cathy_wu/ (accessed May 17, 2020).

[4]   H. Huang et al., “iPTMnet: an integrated resource for protein post-translational modification network discovery,” Nucleic Acids Res., vol. 46, no. D1, pp. D542–D550, Jan. 2018, DOI: 10.1093/nar/gkx1104.

[5]   “Fielding Dissertation: CHAPTER 5: Representational State Transfer (REST).” https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm (accessed May 17, 2020).

[6]   S. Gavali, J. Cowart, C. Chen, K. Ross, C. N. Arighi, and C. H. Wu, “RESTful API for iPTMnet: a resource for protein post-translational modification network discovery,” Database, 2020, DOI: 10.1093/database/baz157.

[7]   “iPTMnetR.” https://udel-cbcb.github.io/iPTMnetR/#/ (accessed May 17, 2020).

[8]   “PyiPTMnet.” https://udel-cbcb.github.io/pyiPTMnet/#/?id?pyiptmnet (accessed May 17, 2020).

 

Author:

Dr Sachin Gavali is a third-year PhD student at the University of Delaware, Newark, DE, USA, studying Bioinformatics and Data Science under the advisement of Dr Cathy Wu. He also works as a Research Assistant in the Department of Computer and Information Science, University of Delaware, Newark, DE, USA.

Sachin presenting his work at the Association for Computing Machinery Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB) 2019, held in Washington, DC.

Currently, he is developing computational methods to study the prevalence of drug abuse in society. More specifically, he is using machine learning and artificial intelligence to build predictive models that stratify the population at risk of developing drug addiction and of death due to drug overdose. In the future, he plans to generalize these methods to extract knowledge from heterogeneous data, which could be used to combat a range of diseases, from genetic ones such as Alzheimer's to societal ones such as addiction.

Sachin, who has been a professional app developer since his BDS days (Terna Dental College, Navi Mumbai), developed Habithub (https://www.thehabithub.com), a personal productivity application for tracking your daily schedule and habits. Everyone needs some inspiration to do great things, and Sachin found his in a blog post on Lifehacker (https://lifehacker.com/jerry-seinfelds-productivity-secret-281626). He published the app on the Android app store in 2014, and since then it has been downloaded 500,000 times. The app is currently installed on 103,200 Android devices worldwide, and 15,000 people use it daily.

In the future, he plans to expand the app to other platforms such as iOS, Windows, Mac, and Linux, so that you can stay productive no matter which device you use. He also plans to apply the knowledge he has acquired during his PhD to improving the app's analytical and predictive features.

