Healthy cell function relies on well-orchestrated gene activity. Via a fantastically complex network of interactions, around 30,000 genes cooperate to maintain this delicate balance in each of the 37.2 trillion cells in the human body.

Broadly speaking, cancer is a disruption of this balance by genetic changes, or mutations. Mutations can trigger over-activation of genes that normally instruct cells to divide, or inactivation of genes that suppress the development of cancer. When a mutated cell divides, it passes the mutation down to its daughter cells. This leads to the accumulation of non-functioning, abnormal cells that we recognise as cancer.

Our laboratory is focused on understanding how one particular cancer – chronic myeloid leukemia or CML – works. Each year more than 700 patients in the UK – and over 100,000 worldwide – are diagnosed with CML. After recent advances, almost 90% of patients under the age of 65 now survive for more than five years.

But in the vast majority of patients CML is currently incurable, and lifelong treatment means that patients must live with side effects and the risk of drug resistance arising. With growing numbers of CML patients surviving (and treatment costing between £40,000 and £70,000 per patient a year), health services are coming under increasing strain.

A Single Mutation

CML is perhaps unique among cancers in that a single mutation, named BCR-ABL, underlies the disease biology. This mutation originates in a single leukemic stem cell, but is then propagated throughout the blood and bone marrow as leukemia cells take over and block the healthy process of blood production. The presence of BCR-ABL affects the activity of thousands of genes, in turn preventing these cells from fulfilling their normal function as blood cells.

Drugs that specifically neutralize the aberrant effects of this mutation were introduced to the clinic from the early 2000s. These drugs have revolutionized CML patient care. Many are now able to live relatively normal lives with their leukemia under good control.

But while these drugs kill the more mature daughter cells of the originally mutated leukemia stem cell, they have not fully lived up to their initial billing as “magic bullets” in the fight against cancer. This is because the original “seed” population of leukemic stem cells evades therapy, lying dormant in the bone marrow to stimulate new cancer growth when treatment is withdrawn.

To truly cure CML we must expose the leukemia stem cells, understand their inner workings, and uproot them. And to do this, we need to learn more about them.

How do they survive the treatment that so readily kills their more mature counterparts? Which overactive or inactivated genes protect them?

We believe that the answers to these questions lie in the analysis of biological “big data”. Genome-scale technologies now allow scientists to measure the activity (or “expression”) of every gene in the genome simultaneously, in any given population of cells, or even at the level of a single cell.

Comparison of expression data generated from leukemia stem cells with the same data generated from healthy blood stem cells will reveal single genes or networks of genes potentially targetable in the fight against leukemia.
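As a rough illustration of what such a comparison involves, the sketch below computes a log2 fold change for each gene between two groups of samples and ranks the genes by the size of that change. The gene names (GENE_A and so on), the expression values and the pseudocount are all invented for illustration. They are not LEUKomics data, and real analyses rely on dedicated statistical methods rather than a simple ratio of means.

```python
import math
from statistics import mean


def log2_fold_changes(leukemic, healthy, pseudocount=1.0):
    """Return {gene: log2(mean leukemic / mean healthy)} for shared genes.

    The pseudocount (an assumed value) avoids dividing by zero for
    genes with no measured expression in one of the groups.
    """
    changes = {}
    for gene in leukemic.keys() & healthy.keys():
        leukemic_level = mean(leukemic[gene]) + pseudocount
        healthy_level = mean(healthy[gene]) + pseudocount
        changes[gene] = math.log2(leukemic_level / healthy_level)
    return changes


def rank_by_change(changes):
    """Sort genes from largest to smallest absolute fold change."""
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical expression values, three samples per group.
leukemic_samples = {
    "GENE_A": [30.0, 31.0, 32.0],  # higher in leukemic stem cells
    "GENE_B": [6.0, 7.0, 8.0],     # unchanged
    "GENE_C": [0.0, 1.0, 2.0],     # lower in leukemic stem cells
}
healthy_samples = {
    "GENE_A": [6.0, 7.0, 8.0],
    "GENE_B": [6.0, 7.0, 8.0],
    "GENE_C": [6.0, 7.0, 8.0],
}

changes = log2_fold_changes(leukemic_samples, healthy_samples)
ranked = rank_by_change(changes)
```

A positive value marks a gene that is more active in the leukemic group, a negative value one that is less active, and a value near zero a gene whose activity does not differ between the two.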

Big Data To The Rescue

In a project funded by Bloodwise and the Scottish Cancer Foundation, we have created LEUKomics. This online data portal brings together a wealth of CML gene expression data from specialized laboratories across the globe, including our own at the University of Glasgow.

Our intention is to eliminate the bottleneck surrounding big data analysis in CML. Each dataset is subjected to manual quality checks, and all the necessary computational processing to extract information on gene expression. This enables immediate access to and interpretation of data that previously would not have been easily accessible to academics or clinicians without training in specialized computational approaches.

Consolidating these data into a single resource also allows large-scale, computationally intensive research efforts by bioinformaticians (specialists in the analysis of big data in biology). From a computational perspective, the fact that CML is caused by a single mutation makes it an attractive disease model for cancer stem cells. However, existing datasets tend to have small sample numbers, which can limit their potential.

The more samples available, the higher the power to detect subtle changes that may be crucial to the biology of the cancer stem cells. By bringing all the globally available CML datasets together, we have significantly increased the sample size, from between two and six samples per dataset to more than 100 altogether. This offers an unprecedented opportunity to analyze gene expression data to expose underlying mechanisms of this disease.
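The link between sample size and power can be sketched with a textbook approximation. The example below assumes a two-sided, two-sample z-test on a single gene, with equal group sizes and a known, equal spread in both groups. The effect size of 0.5 standard deviations and the group sizes of 3 and 50 are assumptions chosen for illustration, not figures from LEUKomics. Real analyses must also correct for testing thousands of genes at once and for differences between the laboratories that produced each dataset.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the difference in group means divided by the
    (assumed known and equal) standard deviation.
    """
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands in either rejection region.
    return (standard_normal.cdf(shift - critical_value)
            + standard_normal.cdf(-shift - critical_value))


# Hypothetical scenario: the same subtle effect, measured with a
# handful of samples per group versus a much larger pooled group.
power_small = approx_power(effect_size=0.5, n_per_group=3)
power_large = approx_power(effect_size=0.5, n_per_group=50)
```

Under these assumptions the chance of detecting the same subtle change is much higher with the larger group, which is the motivation for pooling many small datasets.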

As of March 2017, the portal is up and running in the public domain. We are planning to tour Scotland and present at international conferences, aiming to train researchers in how best to exploit this new resource. Ultimately, we hope that this tool will lead to new ideas and approaches, and attract more funding, in the fight against CML.

And while we continue to expand our representation of CML data in real time from research centers all over the world, we also plan to begin incorporating data from other types of leukemia.

In recent years, targeted therapies have become hugely important in cancer research. By providing these data to the CML research community within LEUKomics, we hope to mobilise new research into cancer-causing leukemic stem cells, and ultimately design treatments to target them without affecting healthy cells. Our database provides a critical stepping stone in this process.

Authors: Lorna Jackson, PhD candidate (Paul O’Gorman Leukaemia Research Centre), University of Glasgow and Lisa Hopcroft, Research Associate (Institute of Cancer Sciences), University of Glasgow. Top Image: Junia Melo, Wellcome Images.

This article was originally published on The Conversation.
