Government

Earlier this month, the NIH announced the development of a "Strategic Plan for Data Science" establishing its objectives and implementation methods for the modernization of data science, which the NIH defines as "the interdisciplinary field of inquiry in which quantitative and analytical approaches, processes and systems are developed and used to extract knowledge and insights from increasingly large and/or complex sets of data." Genomics data in particular is forecast to grow exponentially, equaling or exceeding data from astronomy, YouTube and Twitter, the three other major data producers. Moreover, advances in computer technology are aiming for exascale computing, capable of a quintillion, or 10^18, calculations per second. These supercomputers are expected to open new doors for biomedical research, which, given its data-intensive nature, will be a main driver of the advanced technology. Programs likely to adopt exascale computing include the All of Us Research Program and the Cancer Moonshot program of the Precision Medicine Initiative, as well as the Human Connectome Project and the BRAIN Initiative.
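To put the exascale figure in perspective, a minimal back-of-the-envelope sketch (the 10^18 calculations per second is from the source; the 3 GHz single-core comparison point, at one calculation per cycle, is a hypothetical assumption for illustration):

```python
EXASCALE_OPS_PER_SEC = 10**18       # one quintillion calculations per second
CPU_OPS_PER_SEC = 3 * 10**9         # hypothetical 3 GHz core, one op per cycle

# How long the hypothetical single core would need to match
# just one second of exascale computation.
seconds_needed = EXASCALE_OPS_PER_SEC / CPU_OPS_PER_SEC
years_needed = seconds_needed / (365.25 * 24 * 3600)

print(f"{seconds_needed:.3e} seconds, or about {years_needed:.1f} years")
```

Under these assumptions, a single conventional core would take roughly a decade to perform what an exascale system does in one second.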

To manage the vast amounts of data needed to advance biomedical research, the NIH set out five overarching goals in its Strategic Plan: a common software-as-a-service data infrastructure that individual researchers, institutions and scientific communities will be able to access and build upon; a modernized data ecosystem with improved integration of clinical and observational data into biomedical data science, as well as enhanced storage and sharing of datasets; better data management, analytics and tools to drive efficiency and more productive workflows, including improved discovery and cataloguing resources; an expanded NIH data science workforce; and the establishment of policies for a data system that is Findable, Accessible, Interoperable and Reusable (FAIR).

Source: NIH
