Hepatocellular carcinoma (HCC) is a liver cancer that arises from abnormal proliferation of the hepatocytes and accounts for 80% of primary liver cancer cases . HCC is recognized as a significant cause of cancer-related deaths worldwide and has been a healthcare burden globally . The major causes of the infection include endemic hepatitis B and C viruses, alcohol, non-alcoholic fatty liver diseases, and exposure to aflatoxin [3-5]. The early stage of HCC is often managed via surgery, radiofrequency ablation, and liver transplant. However, the asymptomatic nature of the disease at its early stage and the dearth of effective systemic treatments in the advanced stage have led to unfavorable treatment outcomes [6-8].
The emergence of chemotherapeutic agents targeting specifically the signaling pathways provide a new era of selective chemotherapeutic hits that could violently attack cancerous cells and the tumor microenvironment that, causes tumor proliferation with greater efficacy and low side effects on the normal cells [9,10]. An essential signaling pathway is the tyrosine kinase (T.K.) Epidermal Growth Factor Receptor (EGFR) regulates cell growth, survival, multiplication, differentiation, and apoptosis . EGFRs are over-expressed in many human cancer cells, HCC inclusive [12-14], making this macromolecule a promising target in the rational search for novel chemotherapeutic agents.
The significant challenges militating against the discovery and development of novel drugs include the enormous time and resources expended from the discovery stage to the preclinical and clinical levels. However, applying Computer Aided Drug Design techniques such as quantitative structure-activity relationship (QSAR) modeling, molecular docking, drug-likeness evaluation, and ADMET prediction has helped to mitigate these challenges recently. QSAR modeling provides the mathematical link between the biological properties of molecules and their descriptors. The information garnered from the model could then be used to optimize the potencies of the molecules. Also, Molecular docking is a technique that helps to ascertain the binding affinities of therapeutic compounds to the active sites of a protein target [15,16]. ADME/T, on the other hand, ADME/T defines the fates of bioactive compounds in the biological system with specific reference to their absorption, distribution, metabolism, excretion, and toxicity. The success of a drug in clinical trials is greatly dependent on these properties. Hence, computer-aided ADME/T profiling of drug candidates is essential to modern drug design. Early prediction of these properties in drug candidates helps minimize the attrition rate in drug development [17-23].
Organic compounds, especially those containing benzimidazole moiety, have been identified as an essential pharmacophore in the discovery of novel anti-tumor agents [24-28]. This study aims to apply computer-aided techniques to design highly potent and less toxic benzimidazole-chalcone-based chemotherapeutic agents against HCC by targeting the EGFR receptor. This enzyme governs the growth and dissemination of tumor cells. In achieving the stated aim, a literature search was conducted for benzimidazole-chalcone analogs with proven anti-cancer activities against HCC. The bioactive compounds were subsequently subjected to QSAR modeling, molecular docking simulation, design, drug-likeness, and ADMET profiling.
A dataset of twenty four (24) benzimidazole-chalcone derivatives with proven anti-cancer activities against the HepG2 cancer line was accessed from reported work . Meanwhile, the reported activities of these series of compounds were transformed from IC50 (µM) to pIC50 using the expression; pIC50 = -log IC50 to make their biological activity values fall within statistically normal distribution [30,31]. The bioactive compounds' 2D chemical structures, pIC50 values, and IUPAC nomenclatures are presented in Table S1 as a supplementary file. The geometries of all the molecules were optimized to determine their stable conformations using the DFT (B3LYP/6-31G*) method of Spartan 14 software. PaDEL Descriptor tool was used to calculate nearly 2045 pool descriptors for all the optimized structures. Filtration of the calculated descriptors was carried out with V-WSP data pretreatment tool 1.2v to eliminate descriptors with zero and redundant values. Subsequently, the filtered descriptors were partitioned into a 70% training set for model development and a 30% test set for external validation of the model by using Kennard and Stone's algorithm of the DTC laboratory. The QSAR model was built using the Genetic Function Approximation (GFA) tool of Material Studio v8.0 by setting mutation probability to 0.3 [32,33].
The generated QSAR model was validated using the following internal validation parameters; Friedman Lack of fit (LOF), square of correlation coefficient (R-squared), Adjusted R-squared, Cross validated R-squared (Q2cv). Likewise, external validation of the QSAR model is crucial to obtain models with more reliable predictions. The optimum QSAR model was externally validated using the test set of eight molecules by using the model to predict the pIC50 of the compounds .
Model Applicability Domain
Applicability Domain (A.D.) refers to the physicochemical, structural, or biological space in which the training set of the model has been developed. The resulting model can be reliably applicable only to those compounds which are inside this domain. It helps to estimate the uncertainty in predicting the bioactivities of a congeneric set of molecules. The AD of the optimum model was defined using the standardization approach in A.D. executable jar file of the DTC laboratory at http://poi.apache.org .
The primary essence of this is to develop analogs of benzimidazole-chalcones with enhanced potencies and excellent ADMET properties. In achieving this feat, compounds 9 and 13 were selected as templates for designing new analogues because they fall within the model's AD and display the best inhibitory activity against the HepG2 cell line. Optimization of the potencies of the templates was achieved via their structural modifications guided by the dominant descriptors in the model, giving rise to newly designed ligands. The IC50 values of these new bioactive analogs were predicted using the optimum QSAR model.
Molecular Docking Simulation
To substantiate the potencies of the designed ligands and the templates against the HepG2 cell line, they were subjected to molecular docking simulation against EFGR protease to examine their binding affinities and mechanisms of interaction with the active sites of the protein target. The geometry-optimized structures of the compounds were prepared using the AutoDock Vina interface and saved as pdbqt files . The 3-D crystal structure of EGFR kinase in complex with allosteric inhibitor (PDB Code: 7JXQ) was retrieved from the RCSB protein data bank at www. rcsb.org/pdb. The protein was prepared on the Discovery Studio interface, where water molecules and heteroatoms were removed. Using the AutoDock Vina interface, missing atoms were checked and repaired, partial atomic charges were assigned, and all necessary valency tests and H-atom additions were performed on the prepared protein. EasyDock, a graphical user interface of AutoDock Vina virtual screening software, was used to carry out the docking and the grid box revealed the following binding locations in EGFR; center-x = 48.2036; center-y = 0.0159; center-z = 70.1600; Dimension-x = 59.5890; Dimension-y = 54.6382; Dimension-z = 88.8233. The Discovery Studio Visualizer v126.96.36.19950 was deployed to analyze the kind of interactions in the stable protein-ligand complexes formed. Likewise, the chemical structure of Doxorubicin, a validated anti-cancer drug used as a reference for comparison in this study, was geometry-optimized, prepared, and docked with EFGR protein target to examine its binding affinity to the target [37,38].
Drug-Likeness, Pharmacokinetics, and Toxicity Assessment
Drug-likeness evaluation was performed on the designed compounds to ascertain their oral bioavailability. The Lipinski's and Veber's rules were used to evaluate this critical parameter. The former rule states that for a drug candidate to be orally bioavailable, it must not violate more than one of the following indices; molecular weight (Mw) < 500, number of hydrogen bond donors (HBD) < 5, octanol/ water partition coefficient (Log P) < 5 and number of hydrogen bond acceptors (HBA) < 10 . The Veber’s rule, on the hand, states that for a drug candidate to be orally bioavailable, its topological polar surface area (TPSA) and the number of rotatable bonds (NRB) must be less than 140 Å2 and 10, respectively. The aforementioned physicochemical properties of the molecules were calculated with the aid of the SwissADME (www.swissadme.ch/) online tool and the DataWarrior chemoinformatic program.
Likewise, pharmacokinetics deals with the study of the fates and distribution of pharmacological compounds in the biological system with particular reference to their absorption (A), distribution (D), metabolism (M), and excretion (E). Early ADME and toxicity (T) assays of potential drug candidates are necessary to minimize their failure rates due to weak pharmacokinetic and toxicity profiles during clinical trials. The newly designed ligands were analyzed for their ADMET properties with the aid of Swiss ADME server (http:// www. swiss adme. ch/ index. Php) and DataWarrior chemoinformatic program.
GFA-QSAR Model and Validation
It is an established paradigm in Chemistry that whatever property a molecule exhibit is a function of the descriptors of its structure. QSAR model links the descriptors of a molecule to its biological properties using a regression equation. The GFA-derived QSAR model that connects the anti-cancer properties of the studied bioactive compounds to their molecular descriptors is shown by equation 1. The validation parameters for the regression model and the definitions of the descriptors in the model are presented in Tables 1 and 2, respectively. Table 3 presents the correlation matrix for all the descriptors. Furthermore, Figure 1 presents the plot of experimental pIC50 versus predicted pIC50 for training and test set molecules, as well as the residual plot of the QSAR model.
pIC50 = 0.1188 * ZMIC2 − 0.2277 * RDF120m − 5.4171 (1)
Table 1. Validation Parameters of the Model
Figure 1. The plot of experimental pIC50 verses predicted pIC50 for the training set (A), test set (B), and the residual plot (C) of the model
Table 2. Symbols, definitions, and classes of the descriptors in equation 1
Table 3. Correlation Matrix of Descriptors
Figure 2. 2D chemical structures of the designed ligands and standard Doxorubicin ligand ( H-1: (E)-3-(4-chloro-2-cyclopropylphenyl)-1-(1-(2-(pyrrolidin-1-yl)ethyl)-1H-benzo[d]imidazol-2-yl)prop-2-en-1-one, H-2: (E)-1-(4-(3-chlorocyclopenta-1,3-dien-1-yl)-1-(2-(pyrrolidin-1-yl)ethyl)-1H-benzo[d]imidazol-2-yl)-3-(4-chlorophenyl)prop-2-en-1-one, H-3: (Z)-3-(4-chlorophenyl)-3-fluoro-1-(7-fluoro-4-nitro-1-(2-(pyrrolidin-1-yl)ethyl)-1H-benzo[d]imidazol-2-yl)prop-2-en-1-one )
Molecular Docking Simulations of the designed compounds
The binding affinities (ΔG) of the novel ligands and Doxorubicin against the active sites of the EGFR are presented in Table 4. Their 2D diagrams of interactions with the protein target are displayed in Figure 3.
In-silico Drug-likeness and ADMET Assay
The key physicochemical descriptors of the novel ligands used to assess their drug-likeness and ADMET profiles are presented in Tables 5 and 6, respectively.
Table 4. Predicted IC50 and binding affinity values of the designed ligands
Figure 3. 2D diagram of interactions of the designed ligands and Doxorubicin with the active sites of EGFR protease
Table 5. Physicochemical properties of the compounds.
Table 6. Pharmacokinetics and Toxicity profiles of the Compounds
Equation 1 gives the GFA-derived QSAR model, linking the dominant descriptors of the investigated benzimidazole-chalcones to their anti-cancer properties. The model's statistical parameters comply with the standard validation metrics shown in Table 1, confirming its stability, robustness, and reliability [40,41]. Likewise, the high linearity of the plot of experimental pIC50 against predicted pIC50 for the training set of molecules (Figure 1) further substantiates the high extrapolative power of the model.
As a way of externally validating the model, it was used to predict the pIC50 of the test set of molecules that were not used during the model building. The high linearity of the plot of their experimental pIC50 values against predicted pIC50 values (R2pred = 0.81), as displayed in Figure 1 implies that the model can accurately predict a new set of chemicals within its applicability domain. Furthermore, possible biases that may arise in the model-building processes were checked via the plot of standardized residuals against the experimental pIC50 values (residual plot). The dispersal of residuals on both sides of the zero line (Figure 1) implies that there is no systematic error in the course of building the model [42,43].
Multi-collinearity occurs in a model building if the descriptors are significantly inter-correlated. This scenario weakens the statistical power and stability of a model. The presence of possible multi-collinearity among the descriptors in the regression model was checked via the correlation matrix in Table 3. The low inter-correlation (R2 < 0.5) among the two descriptors confirms the high statistical power of the model.
Importance of the descriptors in the Model
One of the fundamental functions of QSAR modeling is to harness the dominant descriptors of a set of molecules responsible for their observed bioactivities. The optimum model in Equation 1 reveals the dominant influences of ZMIC2 and RDF120m descriptors on the observed chemotherapeutic properties of benzimidazole chalcones against the HepG2 cancer cell line. The positive coefficient of the ZMIC2 descriptor, as shown in Equation 1 implies that the anti-cancer properties of the studied compounds vary directly with the value of this descriptor in the molecules. Thus, for enhanced potency, the structures of the bioactive compounds ought to be optimized to have a higher value of the ZMIC2 descriptor. Also crucial to the observed anti-tumor properties of the studied compounds is the RDF120m descriptor. Its negative coefficient in the model reveals the inverse relationship between this descriptor and the anti-cancer properties of the studied benzimidazole-chalcone compounds. Consequently, for enhanced anti-cancer bioactivity of the compounds, this descriptor's value has to be as low as possible.
Molecular docking simulation
EGFR is a target of many chemotherapeutic agents because it is over-expressed in a considerable number of human cancer cells, HCC inclusive. Here, molecular docking analysis of the designed ligands was performed to examine their binding patterns with the active sites of EGFR protease and compare them with the template molecules and the standard Doxorubicin inhibitor. The result of the docking simulation displayed in Table 4 reveals that the newly designed compounds (Figure 2) have minimum binding energy ranging from –8.1 to –8.9 kcal/mol, with the best result achieved with H-2 (ΔG = -8.9 kcal/mol). All the compounds displayed better binding efficacies compared to the template molecules (ΔG = -7.9 kcal/mol each) and Doxorubicin (ΔG = -7.9 kcal/mol).
The 2D diagrams of the interaction of the designed ligands and the standard drug within the binding pocket of EGFR shown in Figure 3 reveal that H-1 binds to the active sites of the protease through pi-sigma interactions with GLY721 and VAL726, pi-anion interactions with ASP855, alkyl interactions with CYS797, ARG841, LEU799, LEU844, LEU792, LEU718, and ALA743, a pi-alkyl interaction with LEU844, an amide-pi stacked interaction with SER720, and pi-donor hydrogen bond with ALA722. Also, the following interactions were found in the complex of H-2 with EGFR; pi-sigma with GLY721 and LEU844, alkyl with ALA722, ALA743, PHE723, LEU718, and MET790, pi-alkyl with VAL726, CYS797, MET790, LYS745, and ALA743, and a pi anion with ASP855. Likewise, H-3 interacted with the active sites of EGFR through pi-alkyl with ARG748, LEU747, LEU861, ILE759, and LYS860; a pi-anion with GLU749; halogen type with PHE 723, and GLU753; and a pi-pi T-shaped with PHE723. Furthermore, the standard ligand, Doxorubicin, binds to the active pocket of the macromolecule via pi-alkyl interactions with ARG748 and LEU861; a pi-pi T-shaped interaction with PHE723; a pi-sigma interaction with LEU861, and pi-anion interactions with GLU749. A conventional hydrogen bond stabilized these interactions with the ALA750 amino acid residue of the protease. By binding to EGFR active sites via pi-sigma, pi-anion, pi-alkyl, and pi-pi T-shaped interaction types, the standard drug exhibits similar mechanisms with the designed ligands. All the designed ligands showed better binding affinities when compared with the control (Doxorubicin). These results are similar to the findings of Al-Anazi et al. , wherein the authors carried out molecular docking investigations on a series of designed chalcone compounds against the active binding sites of EGFR, and docking scores of some of the designed ligands were found to be better than that of TAK 285 used as control.
Drug likeness is the quantitative description of the oral bioavailability of drugs. The evaluation of the oral bioavailability of drug candidates is sacrosanct because a majority of drugs are administered via the oral route. This examination was carried out on the designed ligands using the Lipinski's and Verber’s rule. The results (Table 5) reveal that the compounds violate none of the rules and, as such, could be inferred that they possess excellent oral bioavailability.
Furthermore, poor pharmacokinetics and high toxicity account for the humongous attrition rates in drug development. Thus, the need to profile bioactive compounds concerning these parameters is a rational drug design strategy. The fates of the designed ligands in the biological system concerning their absorption, distribution, metabolism, excretion, and toxicity were assessed via in silico pharmacokinetic and toxicity profiling. The results (Table 6) show that they possess good gastrointestinal absorption. Also, the blood vessels that vascularize the central nervous system possess a unique feature known as the blood-brain barrier (BBB) that regulates the movement of ions, molecules, and cells between the blood and the brain . The BBB protects the Central Nervous System (CNS) from toxins and xenobiotics. The results (Table 6), as it concerns BBB penetration potentials of the ligands, reveal that only H-1 ligand could penetrate the BBB, unlike H-2 and H-3. It implies that H-1 can affect the CNS if administered as a drug, unlike H-2 and H-3, which pose no significant threat to the CNS. Furthermore, P-glycoprotein (P-gp), also known as multidrug-resistant protein, is one of the proteins expressed naturally in many human body tissues that functions as a biological barrier by extruding toxins and xenobiotics from the system . Substrates of P-gp have reduced serum concentrations due to the efflux action of this protein. The newly designed ligands except for H-3 (Table 6), were found to be substrates of P-gp. It implies that the efficacies of H-1 and H-2 may be threatened if used as anti-cancer agents due to possible reduction in their serum concentrations by the efflux action of P-gp.
Also, biotransformation of drugs into inactive metabolites and their subsequent excretion through the kidney and bile is a cardinal aspect of pharmacokinetic studies. This function is performed mainly by Cytochrome P450 (CYP450) enzymes mainly found in the liver and in lesser amounts in the small intestine, lungs, placenta, and kidneys . All the designed anti-cancer agents, except H-1 (Table 6), were found to be substrates of CYP450 enzymes. It implies that these ligands if used as drugs, may not pose any significant toxicity threat to the biological system as they could easily be metabolized and excreted from the body. Additionally, the insilico toxicity assays of the designed compounds (Table 6) reveal that they exhibit little to no tendencies to cause cancer, genetic deformity, reproductive ill-health, and irritation.
Hepatocellular carcinoma is one of the leading causes of cancer-related deaths globally. Lack of potent drugs for managing this disease, especially at its advanced stage, presents a serious challenge to effectively treating this infection. In the search for novel chemotherapeutic agents against hepatocellular carcinoma, a set of benzimidazole–chalcones with established inhibitory activities against HepG2 cancer cell line was subjected to GFA-QSAR modeling to harness the dominant descriptors of the molecules responsible for their anti-cancer activities. The well-validated model (R2 = 0.93, R2Adj = 0.92, Q2LOO = 0.91, R2Pred = 0.81) reveals the predominant influences of ZMIC2 and RDF120m descriptors in the observed anti-cancer bioactivities of the compounds. With the information garnered from the QSAR model, three potent drug candidates were designed using the most active molecules in the data set as templates. The designed molecules were subjected to molecular docking simulation against the active sites of EGFR protease. They were found to be more potent than the templates and the approved chemotherapeutic agent used herein for quality assurance. The novel ligands bind to the active sites of the target EGFR protease with binding affinity values ranging from -8.1 kcal/mol to -8.9 kcal/mol as against the -7.9 kcal/mol and -7.6 kcal/mol recorded for the templates and the Doxorubicin drug, respectively. Drug-likeness and ADMET evaluation of the designed compounds reveals that they possess good oral bioavailability and favorable pharmacokinetic and toxicity profiles. It is envisaged that the wealth of information in this study could be of immense help in the search for potent and non-toxic chemotherapeutic agents against hepatocellular carcinoma.
The authors gratefully acknowledged the technical support of Mr. Israel Emmanuel Edache of the Department of Pure and Applied Chemistry, University of Maiduguri, Maiduguri, Nigeria.
The authors reported no potential conflict of interest.
J. P. Ameji : 0000-0001-5516-8802