
Addressing data diversity challenges in genomics: a health economics perspective


Pauline Herscu, James Buchanan, Laurence Roope, Patrick Fahr, Sarah Wordsworth

Just as we, in our daily lives, face the challenge of allocating a finite monthly income across priorities such as food, housing or education, health economics is the discipline evaluating the allocation of scarce resources in healthcare. Attempting to answer the question ‘Is this money well spent?’, health economic evaluations (for example cost-effectiveness analyses) reconcile any changes in costs from introducing a new healthcare intervention with associated changes in health outcomes, and consider whether the ratio of costs to health outcomes indicates cost-effectiveness. Ideally, health economic evaluations should be conducted alongside randomised controlled trials, and should be considered within the design process of these studies. However, these health economic studies often take on secondary importance in relation to the main trial, and are sometimes called “piggyback evaluations”.
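The ratio referred to above is commonly formalised as the incremental cost-effectiveness ratio (ICER). As a sketch, writing $C_1$ and $E_1$ for the mean cost and health outcome under the new intervention, and $C_0$ and $E_0$ for the same quantities under the comparator (this notation is introduced here purely for illustration):

$$
\mathrm{ICER} = \frac{C_1 - C_0}{E_1 - E_0}
$$

A new intervention that improves health outcomes is then typically judged cost-effective when its ICER falls below a willingness-to-pay threshold $\lambda$ chosen by the decision-maker.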

Cost-effectiveness is a crucial consideration in the context of genomic testing. Over the past two decades, genomic technologies have evolved from laborious single-gene testing to sequencing the entire genome in one go. For patients with rare genetic diseases, genome sequencing (GS) can help shorten what were previously thought to be never-ending ‘diagnostic odysseys’. In the National Health Service (NHS), the 100,000 Genomes Project (100KGP) was launched in 2013 to build an evidence base to support the use of genome sequencing in routine clinical care for patients with rare diseases and cancer.

From 2015 to 2018, 100,000 genomes from more than 70,000 patients and their relatives were sequenced, providing medical professionals with a trove of genetic and clinical data to inform diagnoses, and providing health economists with an equally rich dataset to evaluate health and cost outcomes. Alongside the 100KGP, health economists at the Health Economics Research Centre (HERC) at the University of Oxford have been undertaking cost-effectiveness analyses to understand the value of implementing GS in the NHS, and to quantify the resulting health outcomes, such as diagnostic yield.

This work evaluates whether patients really benefit from the use of GS in their diagnostic odyssey: do they receive diagnoses, and if so, can these diagnoses effectively improve survival or clinical management? What is the difference in costs incurred by the NHS between patients who do not undergo GS and patients who do?

Although economic evaluation is an important tool for health economists, other tools are also potentially informative in the context of evaluating genomic technologies. Echoing more recent societal concerns, the measurement of inequalities is a growing field in health economics. If a public health programme improves health outcomes on average for a large cohort of patients, can we deem it effective even if not all individuals are likely to benefit?

Focusing on equity, health economists have to adjust the lens of the analysis to distinguish between specific groups of patients: young and old, men and women, different ethnic groups, and different socioeconomic groups. The results of these analyses are also potentially informative for public health professionals designing and implementing national programmes. For example, should a public health programme target women, as they were previously less likely to enrol, and consequently less likely to benefit from the programme? If specific ethnic groups exit the programme and report lower health benefits, they could receive improved or longer care in the future, to bring their health outcomes in line with those of other ethnic groups. To decide if these steps are required, researchers must first determine whether an inequality is an inequity – in other words, if an observed difference between groups is unfair and could be avoided.

This is, however, not straightforward. Analysing how socioeconomic factors both influence and are affected by health has often been a blind spot in health studies that were designed to measure biological, clinical, or genetic outcomes, but not social or cost outcomes. This forces health economists studying equity issues to piggyback onto existing studies and use data to define outcomes that can be used to undertake analyses of equity.

The World Health Organization (WHO) defines the ‘social determinants of health’ as “the non-medical factors that influence health outcomes”. Numerous studies have shown that gender, ethnicity, education level, income and other measures of socioeconomic status (SES) influence health, according to a common gradient: ‘the lower the socioeconomic position, the worse the health’. How does this work? When it comes to health inequities, a common culprit is ‘implicit bias’, defined as “a bias or prejudice that is present but not consciously held or recognized”. Implicit bias is often rife within the health system just as it is in the rest of society, and influences patient-clinician interactions, treatment decisions and treatment adherence, accounting for some of the health inequalities observed. What if health providers consistently downplayed your expression of pain and suggested that you seek mental health support instead of offering a medical solution? This example might sound odd to male readers but female readers may easily remember interactions with health providers that fit this pattern. Patients facing such reactions might go untreated for years, sinking into chronic pain and threatened by dire health outcomes.

Work to understand and quantify equity issues related to the 100KGP is also currently underway at HERC. A key challenge to date has been defining outcomes to measure inequalities. One of the early measures of efficiency for the programme was diagnostic yield, estimated in the early days of the project to be around 22%. Considering this number through the lens of equality, health economists ask further: who are those 22%? Are they equally men and women, Whites and ethnic minorities, patients of all SES? What if inequalities started much earlier, for example at enrolment, resulting in a biased cohort? To answer these questions, it is important to understand who had access to the 100KGP, before measuring inequalities in health and costs: access outcomes, health outcomes and cost outcomes are at the core of our analysis of inequalities in the 100KGP. Once these categories were defined, the task remained to specify outcomes to accurately measure inequality based on the available data.

Defining outcomes requires health economists to balance assumptions about the real world and data availability. The data from the 100KGP is linked to national mortality statistics and secondary care data from hospitals in England, providing a detailed picture of the healthcare utilisation of patients outside the primary care setting. This large dataset, although rich, presents analysts with major challenges. How can we build a measure of health or ill-health that is valid for a cohort of patients with rare diseases, who by definition have extremely uncommon or, in some cases, unique illnesses? How can a researcher, looking at clean data from the comfort of an office chair, begin to grasp the experience of patients who have spent years in hospital prior to receiving a diagnosis?

Reflecting on these questions, we made the assumption that patients who used the healthcare system the most were those most in need of it: the frequency of ‘episodes’, i.e. any use of healthcare resources, was defined as our main health outcome, and the healthcare costs accrued as our main cost outcome. The more frequent the episodes, and the higher the costs, the worse the health and cost outcomes. One advantage of these outcomes was that we could use the full range of data for patients affected by very different rare diseases. That same breadth was, however, also their disadvantage: we could not distinguish between diseases requiring different frequencies of care and generating different levels of cost.
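To make these two outcomes concrete, the following is a minimal sketch of how episode frequency and accrued cost could be derived per patient from a list of episode records. The record layout, patient identifiers and cost values are hypothetical and are not taken from the 100KGP data; a real analysis would likely also account for differing lengths of follow-up.

```python
from collections import defaultdict

# Hypothetical episode records: (patient_id, cost_in_pounds).
# Identifiers and values are illustrative only, not 100KGP data.
episodes = [
    ("p1", 120.0),
    ("p1", 450.0),
    ("p2", 80.0),
    ("p1", 60.0),
    ("p3", 300.0),
    ("p2", 95.0),
]


def summarise_outcomes(records):
    """Return {patient_id: (episode_count, total_cost)}."""
    counts = defaultdict(int)
    costs = defaultdict(float)
    for patient_id, cost in records:
        counts[patient_id] += 1    # health outcome: number of episodes
        costs[patient_id] += cost  # cost outcome: summed cost of episodes
    return {pid: (counts[pid], costs[pid]) for pid in counts}


outcomes = summarise_outcomes(episodes)
```

Note that nothing in this summary records which disease a patient has, which mirrors the limitation described above: two patients with the same episode count may have conditions with very different expected care needs.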

Many genetic diseases affect men and women, or different ethnic groups, unequally. As we were unable to account for these differences in illnesses that could partly or completely explain observed inequalities, it was impossible to assert that these inequalities are necessarily inequities. In this instance, the interpretation of inequality is limited by the nature of a unique cohort of patients, affected by diseases so rare their clinical characteristics are not yet fully known.

Defining outcomes amounts to deciding which questions to ask. Another key challenge health economists face when measuring inequality is deciding about whom to ask the question. To evaluate differences in access to the 100KGP, one solution was to measure the time to enrolment for each patient, i.e. how far along in their diagnostic odyssey patients were when they were offered the opportunity to undergo GS. Asking this question of the data was feasible: the data were available, the statistical approach to be applied was standard and well understood – but which groups of patients should be compared? How should ethnicity or SES be defined?
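As a rough illustration of the access outcome, the sketch below computes time to enrolment per patient and summarises it by group. It assumes, purely for illustration, that the start of the diagnostic odyssey can be approximated by the date of a patient's first recorded hospital episode; the group labels, field names and dates are hypothetical. It is a descriptive summary only, not the statistical approach used in the analysis.

```python
from datetime import date
from statistics import median

# Hypothetical patients. 'first_episode' stands in for the start of the
# diagnostic odyssey (an assumption); 'enrolled' is the enrolment date.
patients = [
    {"group": "A", "first_episode": date(2015, 1, 1), "enrolled": date(2016, 1, 1)},
    {"group": "A", "first_episode": date(2013, 1, 1), "enrolled": date(2016, 1, 1)},
    {"group": "B", "first_episode": date(2014, 1, 1), "enrolled": date(2016, 1, 1)},
    {"group": "B", "first_episode": date(2013, 1, 1), "enrolled": date(2016, 1, 1)},
    {"group": "B", "first_episode": date(2012, 1, 1), "enrolled": date(2016, 1, 1)},
]


def time_to_enrolment_days(patient):
    """Days between the assumed start of the odyssey and enrolment."""
    return (patient["enrolled"] - patient["first_episode"]).days


def median_time_by_group(records):
    """Return {group: median time to enrolment in days}."""
    by_group = {}
    for p in records:
        by_group.setdefault(p["group"], []).append(time_to_enrolment_days(p))
    return {group: median(days) for group, days in by_group.items()}


medians = median_time_by_group(patients)
```

The result depends entirely on how the groups are defined, which is the question the following paragraphs turn to.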

For outcomes, just as for variables of interest, the challenge of any statistical analysis lies in the discrepancy between the data available to describe the world, and the world itself. For example, measures of SES used in health or sociological studies (such as education, income, or geographical location) can be self-declared or externally assessed. They can also be based on a single scale, for example in pounds for income, or on a more indirect scale. In England, the Office for National Statistics (ONS) uses the Index of Multiple Deprivation (IMD), which ‘measures relative deprivation in small areas in England’ based on seven domains of deprivation: income, employment, education, health, crime, barriers to housing and services, and living environment. Scores from these seven domains are aggregated into an overall score for each of the 32,844 small areas in England. This widely accepted measure of SES is obviously ambitious, attempting to summarize seven factors in one score, but also limited: every person living in the same small area is assigned the same SES, without consideration of household or individual differences, which limits our analysis of inequalities.

Considering ethnicity, two variables were available: ‘ethnicity’, self-declared by patients according to 17 categories, and ‘ancestry’, genetically inferred from each patient’s individual genome according to 6 categories. According to the latest census, the population of England and Wales is 86% White. It was therefore not a surprise that there were insufficient patients from ethnic minorities enrolled in the 100KGP for us to accurately measure outcomes for each ethnic group. Here again, our original intent to measure inequalities as robustly as possible had to adapt to the available data: what was measurable were inequalities between Whites and individuals falling into the category Black, Asian and Minority Ethnic (BAME). What we lost in detail through this grouping, we gained in certainty: there were enough BAME participants to allow us to detect statistically valid differences in outcomes compared to Whites.

Once ethnicity and ancestry were recoded into Whites and BAMEs, the question remained as to which variable is best to measure inequalities. Genetically inferred ancestry seems at first to be more accurate than self-declared ethnicity – but can genetic ancestry always predict ethnicity, as it is seen by the patient and by society? Implicit bias is a reaction to a perceived ethnicity, which truly is a ‘social determinant of health’. On the other hand, ancestry could help us understand biological aspects of inequalities, as mentioned above, since certain ethnicities can be affected by different illnesses, resulting in different health or cost outcomes.

Choosing one variable over the other was already taking a stance: choosing ethnicity suggested our assumption that observed inequalities are indeed inequities, fed by implicit bias; choosing ancestry suggested a different assumption, that observed inequalities could at least partially be linked to differences in the rare diseases affecting certain ethnicities. Our solution was to run each model twice, once with the ethnicity variable, once with the ancestry variable. It has been shown that implicit bias is based on perceived ethnicity, which should be the preferred variable to analyse how individual health is affected by any form of discrimination. However, a cohort of patients with rare diseases cannot be treated as a cohort of patients affected by the same disease. Data on healthcare utilization for each rare disease is lacking here to help us understand whether observed inequalities are indeed inequities.
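The "run each model twice" approach can be sketched as follows. A simple difference in mean episode counts stands in for the actual models, which are not specified here; the records, field names and values are hypothetical, and both variables are assumed to have already been recoded to "White" and "BAME".

```python
from statistics import mean

# Hypothetical records. 'ethnicity' is self-declared and 'ancestry' is
# genetically inferred; both are assumed already recoded to White / BAME.
patients = [
    {"ethnicity": "White", "ancestry": "White", "episodes": 4},
    {"ethnicity": "White", "ancestry": "White", "episodes": 6},
    {"ethnicity": "BAME", "ancestry": "BAME", "episodes": 9},
    {"ethnicity": "BAME", "ancestry": "White", "episodes": 7},
    {"ethnicity": "White", "ancestry": "BAME", "episodes": 5},
]


def mean_difference(records, group_var, outcome="episodes"):
    """Mean outcome in the BAME group minus mean outcome in the White group.

    A stand-in for a fitted model: it only shows how the grouping
    variable changes the comparison, and adjusts for nothing.
    """
    bame = [r[outcome] for r in records if r[group_var] == "BAME"]
    white = [r[outcome] for r in records if r[group_var] == "White"]
    return mean(bame) - mean(white)


# Same analysis, run once per grouping variable.
results = {var: mean_difference(patients, var) for var in ("ethnicity", "ancestry")}
```

In this toy data the two variables disagree for two patients, so the estimated gap differs between the runs. Comparing the two results side by side is what allows the choice of variable, and the assumption behind it, to be made explicit.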

If we wish to give inequities a major focus in the new NHS Genomic Medicine Service (GMS), improved measurement of inequalities is crucial, which will likely require an expansion in data collection. Additional data will help us to understand differences in health outcomes not only between Whites and BAMEs, but also between each ethnic group in England. Different data, explicitly designed for the purpose of measuring inequities, will ensure that such analyses no longer need to piggyback on studies that were only really designed to measure clinical or medical outcomes, and are instead established in their own right as a necessary part of understanding how health is affected by its social determinants. Via the 100KGP and now the GMS, the NHS is the first national health care system to offer genome sequencing as part of routine care, cementing the position of the UK as a world leader in this context. It is important that we leverage this leadership going forward to address data diversity challenges, and health economics tools and methods will be crucial in achieving this objective.


Pauline Herscu is an epidemiologist and health economist by training. She completed her master's thesis at the Health Economics Research Centre (HERC) at the University of Oxford, where she conducted the first analysis of equity within the 100,000 Genomes Project, a National Health Service (NHS) programme. Following her graduation with an MSc in Epidemiology from the Ludwig-Maximilians University in Munich, she now works as a consultant in Epidemiology, Health Economics and Market Access.
