Machine Learning for the Discovery of DNA‐binding Proteins in Plants
Book chapter

Upendra Kumar Pradhan, Prabina Kumar Meher and Pushpendra Kumar Gupta
Bioinformatics for Plant Research and Crop Breeding, pp.299-319
John Wiley & Sons, Ltd
2024

Abstract

Keywords: deep learning; DNA‐binding proteins; machine learning; numeric features; plant
DNA‐binding proteins (DBPs) are a broad group of proteins that interact with DNA and play key roles in a wide range of biological processes, such as DNA replication, recombination, repair, and regulation of gene expression. Each DBP contains a DNA‐binding domain that is essential for the physical interaction between the protein and DNA. Identifying DBPs and understanding how they interact with DNA is essential for understanding the many cellular activities that depend on this binding. Several experimental approaches are available for identifying DBPs, including X‐ray crystallography, chromatin immunoprecipitation (ChIP), the electrophoretic mobility shift assay, and the yeast one‐hybrid system. Because these techniques are costly and time‐consuming, computational methods have been developed for the identification of DBPs. These methods fall into three major classes: (i) sequence‐based methods, which detect DBPs using sequence‐derived features such as position‐specific scoring matrices (PSSMs); (ii) structure‐based methods, which use information derived from protein structures; and (iii) methods based on the physico‐chemical properties and amino acid composition of the proteins. In this chapter, we discuss different machine learning‐based computational methods for the identification of DBPs in plants. We also describe the features used to map amino acid sequences onto numeric feature vectors and the available tools and strategies for improving prediction accuracy. The contents of this chapter should prove useful for biologists who want to develop and use computer algorithms for the prediction of DBPs in plants.
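
As an illustration of what mapping an amino acid sequence onto a numeric feature vector can look like, the sketch below computes amino acid composition, one of the feature types named in the abstract. It is a minimal example under stated assumptions and is not taken from the chapter: the function name `amino_acid_composition`, the fixed residue ordering, and the toy sequence are all hypothetical choices made here for illustration.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order (an assumed convention,
# so that every sequence maps to a vector with the same layout).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Non-standard symbols (e.g. X, B, Z, gaps) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not a real protein.
features = amino_acid_composition("MKRKAAKRM")
```

The result is a 20‐dimensional vector whose entries sum to 1, so sequences of different lengths become comparable inputs for a classifier. Richer encodings such as PSSM‐derived features would need additional tools and are not shown here.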
