Multi-Drug Resistance in Cancer: Data Analysis for P-glycoprotein Inhibition

October 25, 2017

Drug efflux and P-glycoprotein

Over time, cancerous cells can become resistant to treatments such as chemotherapy through several mechanisms. One of them is drug efflux, an otherwise useful mechanism that protects normal cells by keeping the intracellular drug concentration below a cell-killing threshold. Drug efflux relies on proteins called transporters, whose function is to move a variety of substances across cellular membranes.

However, in the case of cancer, drug efflux can become a problem: some transporters are able to carry chemotherapy drugs out of the cancerous cells, thus protecting the cells from being destroyed by the medication. P-glycoprotein, or P-gp, is one of these transporters. What makes P-gp special is that it can transport many kinds of substrates, which is why overexpression of P-gp can cause multidrug resistance in cancerous cells.

P-gp is an efflux pump located in the cell membrane. It has two kinds of domains: the Nucleotide Binding Domains, or NBDs, and the Drug Binding Domains, or DBDs (also called TransMembrane Domains, or TMDs). The NBDs bind and break down an energy storage molecule, adenosine triphosphate (ATP), to release energy. This energy powers the DBDs, whose role is to pump drugs out of the cell.

Figure 1: P-glycoprotein (original picture from

When P-gp expels drugs from the cell, it changes shape in order to reposition its DBDs and NBDs. This sequence of changes is called P-gp’s catalytic transition cycle. In the initial structure of the protein, the two NBDs (in yellow on the figure) are open to ATP, the energy molecule. The DBDs (in blue and green on the figure) are open to the inside of the cell and ready for a substrate to bind to P-gp. When this happens, both the NBDs and the DBDs close in order to hold the ATP and the substrate inside the protein. The substrate is then pushed out of the cell when the DBDs open to the extracellular space. Finally, the energy stored in the ATP molecules is consumed so that P-gp can return to its initial state. Each structure that the protein adopts during this process is called a pose.

Figure 2: P-glycoprotein’s catalytic cycle

Combination drug therapy

One way to sensitize cancerous cells to chemotherapy drugs is combination drug therapy. This method consists of using inhibitory drugs to block the transporters’ activity, so that the chemotherapy drugs remain inside the cancerous cells long enough to destroy them. It involves looking for candidate molecules capable of blocking a target protein (for example, P-gp). These candidate molecules are those that show strong binding activity towards the target protein.

A chemical compound that binds to the NBDs hinders their activity by blocking access for the energy molecules, thus preventing P-gp from consuming energy. A compound that binds to the DBDs, on the other hand, can be transported out of the cell by P-gp. In order to prevent drug efflux, an ideal compound is thus one that binds strongly to the NBDs but weakly to the DBDs.
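The selection criterion above can be expressed as a simple score. The following is a minimal sketch, assuming each compound comes with one predicted binding energy for the NBDs and one for the DBDs, expressed in kcal/mol, where a more negative value means stronger binding (the usual convention in docking software). The class, the field names and the numbers are hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    nbd_affinity: float  # predicted binding energy to the NBDs (kcal/mol, more negative = stronger)
    dbd_affinity: float  # predicted binding energy to the DBDs (kcal/mol, more negative = stronger)


def selectivity(compound: Compound) -> float:
    """Positive when the compound binds the NBDs more strongly than the DBDs."""
    return compound.dbd_affinity - compound.nbd_affinity


def rank_by_selectivity(compounds):
    """Most NBD-selective compounds first."""
    return sorted(compounds, key=selectivity, reverse=True)


# Made-up values, for illustration only.
compounds = [
    Compound("cmpd_A", nbd_affinity=-9.5, dbd_affinity=-6.0),
    Compound("cmpd_B", nbd_affinity=-7.0, dbd_affinity=-9.0),
    Compound("cmpd_C", nbd_affinity=-8.0, dbd_affinity=-7.5),
]

ranked = rank_by_selectivity(compounds)
for compound in ranked:
    print(compound.name, round(selectivity(compound), 2))
```

A single difference is a deliberately crude score: it ignores how strong the NBD binding is in absolute terms, so in practice it would probably be combined with a minimum binding strength.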

The role of data analytics

High-Throughput Screening, or HTS, consists of screening an entire compound library against the target protein: it is a “real screening” method. HTS allows scientists to test hundreds of thousands of compounds a day by using complex laboratory automation. However, performing HTS is expensive. One way to reduce this cost is to perform “virtual screening” before performing HTS. Virtual screening is cheaper than real screening, because it does not require access to a laboratory and because it allows compounds that have not yet been purchased or synthesized to be tested. It cannot replace real screening, but it helps to reduce the number of compounds that must go through real screening. Virtual screening can be performed by software such as AutoDock Vina.
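To give an idea of what virtual screening produces, here is a small sketch that extracts the best predicted binding energy from a docking result. It assumes the result is a PDBQT text in which each predicted binding mode carries a `REMARK VINA RESULT:` line whose first number is the affinity in kcal/mol, as AutoDock Vina writes it. The function name and the sample text are invented for the example.

```python
def best_vina_score(pdbqt_text):
    """Return the most negative affinity (kcal/mol) among the reported modes, or None."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # The first field after the colon is the predicted affinity.
            fields = line.split(":", 1)[1].split()
            scores.append(float(fields[0]))
    return min(scores) if scores else None


# Shortened, made-up excerpt of a docking output (atom records omitted).
sample_output = """\
MODEL 1
REMARK VINA RESULT:      -9.1      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:      -8.4      1.532      2.107
ENDMDL
"""

print(best_vina_score(sample_output))
```

Running such a function over one output file per compound would give one binding-strength value per compound, which is the kind of feature used in the rest of the process.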

Both the virtual screening and the real screening of compounds generate a large amount of data. While it would be possible to manually select the best compounds among a few tested compounds, this approach becomes impractical when compound libraries contain tens of thousands of compounds. By using data analytics methods to make a relevant selection, biologists can save time and money.

The process of identifying promising compounds

The first step towards accomplishing this is to perform virtual screening on a large compound library. The result of this screening is a set of features, such as the binding strength to specific areas of the target protein, together with the value that the virtual screening assigns to each compound for each feature.
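One plausible way to store such a result is a table with one row per compound and one column per feature. The sketch below reads a table of this kind into a dictionary using Python's standard `csv` module. The column names and values are hypothetical, and real screening results would typically contain many more features.

```python
import csv
import io

# Made-up screening result: one row per compound, one column per feature.
screening_csv = """\
compound,nbd_affinity,dbd_affinity
cmpd_A,-9.5,-6.0
cmpd_B,-7.0,-9.0
"""


def load_features(text):
    """Map each compound name to a {feature name: value} dictionary."""
    table = {}
    for record in csv.DictReader(io.StringIO(text)):
        name = record.pop("compound")
        table[name] = {feature: float(value) for feature, value in record.items()}
    return table


features = load_features(screening_csv)
print(features["cmpd_A"])
```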

A list of promising compounds can be manually extracted from the result of the virtual screening: this selection is done by biologists using their professional knowledge. For example, compounds that seem to bind exceptionally strongly to the NBDs can be selected because they are believed to be effective against P-gp, for the reasons detailed previously. These promising compounds can then be purchased in order to perform real screening and obtain more reliable values and more features, such as the actual efficacy of the compounds against P-gp.

This is where data analytics comes into play. The goal here is to build a reliable model, capable of predicting the efficacy of a compound against P-gp from the information contained in the other features. This model can then be applied to the dataset originally obtained by virtual screening. Instead of selecting promising compounds by hand, the biologists can make a decision supported by the model, which predicts as accurately as possible which compounds might be the most effective against P-gp.
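As an illustration of this last step, the sketch below trains on a handful of lab-tested compounds and predicts the efficacy of compounds that were only screened virtually. The article does not prescribe a particular model. A k-nearest-neighbours average is used here only because it fits in a few lines, and all names, features and values are made up.

```python
import math


def knn_predict(train, query, k=2):
    """Average the measured efficacy of the k lab-tested compounds closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(efficacy for _, efficacy in nearest) / len(nearest)


# (nbd_affinity, dbd_affinity) -> efficacy measured by real screening (hypothetical 0..1 scale)
lab_tested = [
    ((-9.5, -6.0), 0.90),
    ((-9.0, -6.5), 0.80),
    ((-7.0, -9.0), 0.20),
    ((-6.5, -8.5), 0.10),
]

# Compounds that only went through virtual screening.
virtual_only = {
    "cmpd_X": (-9.2, -6.2),
    "cmpd_Y": (-6.8, -8.8),
}

predictions = {
    name: knn_predict(lab_tested, feats) for name, feats in virtual_only.items()
}

# Compounds with the highest predicted efficacy come first.
shortlist = sorted(predictions, key=predictions.get, reverse=True)
print(shortlist)
```

With a realistic dataset, the model would need to be validated on lab-tested compounds held out from training before its predictions are used to choose which compounds to purchase.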

Figure 3: Steps of the process

