EU Pharma Survey Spotlights Big Data Challenges

Pharmaceutical companies and contract research organizations are increasingly adopting big data. This conclusion is borne out by a Fall 2018 survey conducted by the HMA–EMA Joint Data Taskforce, formed by the Heads of Medicines Agencies (HMA) and the European Medicines Agency (EMA). The EMA is the EU’s regulatory agency for the approval and monitoring of medicines as well as policy development. The HMA is made up of the National Competent Authorities, the organizations responsible for medical product regulation in the European Economic Area.

The survey is part of a summary report published last month. The report addresses how big data will affect the needs and responsibilities of regulatory agencies in the areas of medicine and health, and summarizes recommendations by the six HMA-EMA subgroups (genomics, bioanalytical ‘omics [focused on proteomics], clinical trials, observational data, spontaneous adverse drug reactions, and social media and mobile-health data). Each subgroup’s priority recommendations, taken from the report, are listed at the end of this blog.

  • Definition of big data used by the report: “[E]xtremely large datasets which may be complex, multi-dimensional, unstructured and heterogeneous, which are accumulating rapidly and which may be analyzed computationally to reveal patterns, trends and associations. In general, big data sets require advanced or specialized methods to provide an answer within reliable constraints.”

To record the current state of big data’s applications, the Taskforce conducted two surveys: one addressed National Competent Authorities’ big data capabilities (24 respondents), and one examined how big data could be, and is now, used by pharmaceutical companies for drug development (37 respondents). Half of all respondents to the second survey describe themselves as involved in the “development, production or marketing of pharmaceuticals.”


Survey Results

This blog focuses on the results of the second survey, specifically on the nearly half of survey participants that are companies with more than 250 employees (larger companies). Each of these companies has commercial products available in the EU.

The three areas identified by larger companies where big data will have the “greatest impact” (based on 16 choices with more than 1 answer allowed) are “patient stratification/personalized medicine,” “understanding current clinical care/patient pathway and unmet need” and “clinical trial design.” In terms of product lifecycle, the datasets cited as most applicable to these “impacted areas” (based on open text answers) are “electronic health records/electronic medical records” and “claims.”

But where are the larger companies currently using big data in drug development for “decision making”? The top two areas (based on an open text request for three examples) are “post-authorization safety surveillance and regulatory” and “target/outcomes identification.” In contrast, no larger company reported use for “patient stratification/personalized medicine,” even though it was listed as one of the areas expected to have the greatest impact.

Not surprisingly, “social media data (e.g., Twitter data)” is rated as the type of data about which respondents at larger companies have the greatest “concerns” in terms of the validity of big datasets (based on a 1–5 rating of 11 concerns). The second-highest number of respondents selected “adverse drug reaction data,” followed by “administrative claims data.” Interestingly, “proteomics,” “clinical trial data (via data sharing platforms)” and “imaging datasets (functional MRI, PET, etc.)” are cited by the lowest number of respondents as areas of concern about validity.

“Data access” is identified as the “key challenge” by a high percentage of larger companies (based on a 1–5 rating of 11 challenges). The second-highest-rated challenge is “data harmonization across Europe.” The next most highly rated choices are “data privacy/legislation on data protection” and “integration of multiple datasets.” As to how regulatory networks can meet these challenges (based on open text answers), the highest number of companies list “providing data sources,” followed by “rules/regulation/guidelines” and “provide information.”

“Harmonization within and between countries” is listed as the “greatest international challenge” (based on open text answers) by a high number of larger companies, but it is not the most frequently cited answer. More respondents cite “data quality” and “access to data.”


Recommendations Designated as Top Priorities:

1) Clinical Trial and Imaging Subgroup:

  • Agree on data formats and standards for regulatory submissions of raw patient data.
  • The European regulatory network should have direct access to individual patient data during assessment of a marketing authorization.

2) Observational Data Subgroup (Electronic Health Records):

  • Mechanisms are required to drive the standardization of, and access to, secondary care data.
  • Development of data sources in European member states which do not currently provide access to electronic health records for observational research.
  • Sustainable mechanisms for combining healthcare data across Europe should be implemented.
  • Increase the consistency of recording information on exposure to medicines including indications for use, product, dose and route, duration.
  • Increase the consistency of recording of outcomes.
  • Development of a framework to articulate for what questions and contexts real-world evidence may be acceptable across the product life cycle.

Observational Data Subgroup (Patient Registries):

  • Harmonization of data elements, standards, terminologies and quality attributes to improve data interoperability.
  • The sharing of information between registries within a disease area should be encouraged.
  • Implement measures to increase the acceptability of registry data for regulatory decision making.

Observational Data Subgroup (Drug Utilization Databases):

  • Initiatives are required to increase access to hospital prescribing.
  • Increase the consistency of recording information on exposure to medicines including product, dose and route.
  • Increase knowledge of the availability of drug consumption data.

3) Spontaneous Adverse Drug Reaction Subgroup:

  • Evaluate new analytical tools, such as forecasting and machine learning, that leverage increased dimensions of data (spatial-temporal, other variables in case reports, meta-data).

4) Social Media and Mobile-Health Data Subgroup (m-Health):

  • Facilitate the use of m-Health devices to record the efficacy and safety of medicines.

5) Genomics Subgroup:

  • Stimulate public sharing of genomics and clinical trial data.
  • Optimize data sharing and linkage of phenotypic and/or treatment parameters to genomics datasets.
  • Establish requirements regarding data quality for regulatory submissions.
  • Address the knowledge/expertise gap across the European regulatory network to ensure big data applications can be reliably assessed.
  • Provide faster and more agile regulatory guidance in fast-moving scientific areas.

6) Bioanalytical ‘Omics Subgroup:

  • Guidance should be provided on the acceptability of big data sets to support regulatory decision making.
  • Clear guidance should be provided for the validation of bioanalytical methods suitable for the complexity of ‘omics techniques.
  • The data (file) formats used should be harmonized.
  • The number of data standards used should be minimized.
  • To ensure appropriate assessment of regulatory submissions, expertise in various disciplines (e.g., mathematical modelling and simulation, bioinformatics and computer sciences) will be needed.