"An Extended Atom Type System for Algebraic Graph-Based Machine Learning Model in Drug Design"
Drug discovery is a highly complicated and time-consuming process. One of the main challenges in drug development is predicting whether a drug-like molecule will interact with a specific target protein. This prediction is crucial for expediting target validation and discovery, and it enables biochemists and pharmacists to accelerate the drug development process. In recent biomolecular research, algebraic graph-based models that accurately represent molecular complexes and predict drug-target binding affinity have generated significant interest. Here, we present an algebraic graph-based molecular representation used to build a data-driven scoring function (SF), named AGL-EAT-Score, which features extended atom types to capture wide-range interactions between the target and the drug candidate. Our model applies multiscale weighted colored subgraphs to the protein-ligand complex, where the graph coloring is based on SYBYL atom-type and ECIF atom-type interactions. Furthermore, combined with the gradient-boosting decision tree (GBDT) machine-learning algorithm, our newly developed SF outperforms numerous state-of-the-art models on the PDBbind benchmarks for binding-affinity scoring power and on the D3R dataset, a worldwide grand challenge in drug design.
Additional authors: Dr. Duc Nguyen, University of Kentucky; Md Masud Rana, University of Kentucky
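To make the idea of multiscale weighted colored subgraphs more concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not specify the kernel, cutoff, scales, or descriptors, so everything here is an assumption for illustration: the generalized exponential kernel `exp_kernel`, the 12 Å cutoff, the two scale values in `etas`, the simplified atom-type labels, and the helper names `subgraph_laplacian`, `spectral_stats`, and `featurize` are all hypothetical. For each protein-ligand atom-type pair (a "color"), the sketch builds the Laplacian of a weighted bipartite subgraph and summarizes its eigenvalue spectrum using the sum, mean, and variance, which can be read directly from the trace and the squared entries of a symmetric matrix.

```python
import math
from itertools import product


def exp_kernel(r, eta, kappa=2.0):
    """Assumed edge weight: generalized exponential kernel, equal to 1 at r = 0."""
    return math.exp(-((r / eta) ** kappa))


def subgraph_laplacian(protein, ligand, p_type, l_type, eta, cutoff=12.0):
    """Laplacian of the weighted bipartite subgraph for one atom-type pair.

    protein / ligand: lists of (atom_type, (x, y, z)) tuples.
    Edges connect only protein atoms of p_type to ligand atoms of l_type
    that lie within the (assumed) distance cutoff.
    """
    p_xyz = [xyz for t, xyz in protein if t == p_type]
    l_xyz = [xyz for t, xyz in ligand if t == l_type]
    n_p = len(p_xyz)
    n = n_p + len(l_xyz)
    lap = [[0.0] * n for _ in range(n)]
    for (i, a), (j, b) in product(enumerate(p_xyz), enumerate(l_xyz)):
        r = math.dist(a, b)
        if r > cutoff:
            continue
        w = exp_kernel(r, eta)
        k = n_p + j  # ligand vertices are indexed after protein vertices
        lap[i][k] -= w
        lap[k][i] -= w
        lap[i][i] += w
        lap[k][k] += w
    return lap


def spectral_stats(lap):
    """Sum, mean, and variance of the eigenvalues of a symmetric matrix.

    Uses trace = sum of eigenvalues and sum of squared entries = sum of
    squared eigenvalues, so no eigensolver is required.
    """
    n = len(lap)
    if n == 0:
        return [0.0, 0.0, 0.0]
    trace = sum(lap[i][i] for i in range(n))
    sum_sq = sum(v * v for row in lap for v in row)
    mean = trace / n
    var = max(sum_sq / n - mean ** 2, 0.0)  # clamp tiny negative round-off
    return [trace, mean, var]


def featurize(protein, ligand, type_pairs, etas=(2.0, 6.0)):
    """Concatenate spectral statistics over all type pairs and kernel scales."""
    feats = []
    for (p_type, l_type), eta in product(type_pairs, etas):
        lap = subgraph_laplacian(protein, ligand, p_type, l_type, eta)
        feats.extend(spectral_stats(lap))
    return feats


# Toy complex with simplified element labels standing in for atom types.
protein = [("C", (0.0, 0.0, 0.0)), ("N", (1.0, 0.0, 0.0))]
ligand = [("O", (0.0, 0.0, 1.0))]
features = featurize(protein, ligand, [("C", "O"), ("N", "O")])
```

In a full pipeline, a feature vector like `features` would be computed for every complex in a training set and passed, together with measured binding affinities, to a gradient-boosting regressor (for example scikit-learn's `GradientBoostingRegressor`). Replacing the element labels with SYBYL or ECIF atom types enlarges the set of type pairs and hence the feature vector.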