Earlier this week, the Allotrope Foundation, an international consortium of drug makers, announced an achievement in its aim to develop a universal format for scientific data, releasing the Allotrope Data Model (ADM), the third component of the Allotrope Framework.
The Allotrope Framework is designed to standardize and contextualize experimental data at the point of creation, enabling a new chapter in the collection, management and analysis of scientific data. The Allotrope Partner Network (APN) consists of instrument vendors and software firms that are collaborating with their customers on the design and implementation of the Framework and incorporating it into their commercial products. The ADM builds upon the earlier releases of the Allotrope Data Format (ADF) and the Allotrope Foundation Ontologies (AFO).
- ADF: a family of specifications designed to standardize the acquisition, exchange, storage and access of analytical data captured in laboratory workflows
- AFO: an ontology suite that provides a standard vocabulary and semantic model for the representation of laboratory analytical processes
- ADM: a mechanism to define data structures (schemas, templates) that describe how to use the ontologies for a given purpose in a standardized (i.e., reproducible, predictable, verifiable) way
According to Dana Vanderwall, PhD, chair of the Allotrope Board of Directors, “The ADF is the container in which we store the data, using the software libraries Allotrope provides to implement the standard. It can store different kinds of data, not only the data that was actually acquired off the instrument but also the semantic layer, what we call the data description, for storing the more descriptive context.”
The AFO makes the data and context available through a standard vocabulary that can be understood and used by all users, machines as well as humans. “So the vocabulary that we use to do science and communicate with one another is just as important as standardizing the container in which we put it, so that we literally know we’re talking about the same thing, in the same context,” he said.
The ADM is the structure in which to use this ontology, making it usable to software developers. As Dr. Vanderwall explained, “You need some way to structure the words to actually create the scientific equivalent of the language, because you could use the same words in different ways or somebody could decide to describe something in different levels of detail.” The ADM is essentially a “library of reusable modeling patterns,” said Vincent Antonucci, PhD, Business Product Owner of the ADM Product and vice chair of the Allotrope Board of Directors.
Both the ADF and ADM are available under an Allotrope license. However, the AFO is publicly available under a Creative Commons license. “[This is] partly because there is so much work being done in our domain around ontology, and we feel like we’re filling in some really critical gaps in the experimental base for ontologies that we want to share broadly for the benefit of the field as a whole,” noted Dr. Vanderwall.
The Framework and HPLC-UV
Included in the newly released ADM is an updated version of the AFO for high-performance liquid chromatography with ultraviolet detection (HPLC-UV). HPLC-UV is the first instrument technique to be addressed by Allotrope with a detailed ‘graph’ model. Commercial products for the ADF are already available. Last summer, Agilent Technologies released the OpenLab CDS [Chromatography Data System] ChemStation Edition, incorporating Allotrope’s ADF and AFO. Because of its timing, that release does not include the ADM, but Agilent will release versions of ChemStation and OpenLab CDS that do. The first release is an important step, said Dr. Antonucci. “Just by loading the information using the first two products into the Allotrope container, you can begin to contextualize the data and export it in Allotrope format. This demonstrates that you can use parts of the Allotrope solution as they become available as part of a phased implementation plan,” he said.
Zontal, which provides software for data archiving, has also released a commercial product incorporating components of the Allotrope Framework. Waters, another CDS provider, announced a partnership in December 2018 with TetraScience to convert Waters’ Empower 3 Software data to the ADF as part of an Allotrope Community Project. Both Agilent and Waters are part of the APN, giving them access to the Framework. “The Allotrope Framework components, including the data model and the ontologies we built specifically for HPLC, are designed to be vendor neutral,” explained Dr. Antonucci. Once these components become part of a vendor’s for-profit software, the vendor is required to hold an Allotrope license, which carries an annual user fee.
The HPLC-UV ADM also serves as a base for future Allotrope data modeling efforts addressing other analytical techniques, enabling faster development. “Each technique uses many of the same principles to structure data, providing the opportunity to assemble a library of reusable modeling patterns. Think of it like Lego blocks you can click together in different ways to model different things, like a new instrument or a new domain,” explained Dr. Antonucci. “But you do need a different schema for each type of instrument because the inputs and outputs are different and the parameters you control are different; those schema are comprised of many of the reusable units.” Planning to address other analytical techniques is underway, he said.
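To make the “Lego blocks” idea concrete, here is a minimal sketch in Python of how reusable modeling patterns might be clicked together into per-instrument schemas. It is an illustration only and does not use the Allotrope libraries or the actual ADM: the `Field`, `compose` and `validate` names, the term labels and the balance example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One slot in a pattern: a vocabulary term plus the expected value type."""
    term: str          # hypothetical label, not a real AFO identifier
    value_type: type


# Reusable patterns shared by every technique (all names are assumptions).
SAMPLE = (Field("sample identifier", str),)
DEVICE = (Field("device identifier", str), Field("device vendor", str))

# Technique-specific patterns: inputs, outputs and parameters differ.
LC_SEPARATION = (Field("flow rate (mL/min)", float), Field("column identifier", str))
UV_DETECTION = (Field("detection wavelength (nm)", float),)
WEIGHING = (Field("sample weight (mg)", float),)


def compose(*patterns):
    """Click patterns together into one schema mapping term -> value type."""
    schema = {}
    for pattern in patterns:
        for field in pattern:
            if field.term in schema:
                raise ValueError(f"duplicate term: {field.term}")
            schema[field.term] = field.value_type
    return schema


def validate(record, schema):
    """Return a sorted list of problems; an empty list means the record conforms."""
    problems = []
    for term, value_type in schema.items():
        if term not in record:
            problems.append(f"missing: {term}")
        elif not isinstance(record[term], value_type):
            problems.append(f"wrong type: {term}")
    for term in record:
        if term not in schema:
            problems.append(f"unknown term: {term}")
    return sorted(problems)


# One schema per instrument type, built largely from the same reusable units.
HPLC_UV_SCHEMA = compose(SAMPLE, DEVICE, LC_SEPARATION, UV_DETECTION)
BALANCE_SCHEMA = compose(SAMPLE, DEVICE, WEIGHING)  # hypothetical second technique
```

In this sketch, both schemas share the sample and device patterns, while each adds the parameters specific to its instrument. Calling `validate(record, HPLC_UV_SCHEMA)` on a dictionary of term-value pairs reports any missing, mistyped or unrecognized terms, which loosely mirrors the “reproducible, predictable, verifiable” goal described for the ADM.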
Instrument vendors are already working with Allotrope’s pharmaceutical and biotech company members, which have all implemented parts of the Framework to varying degrees as proofs of concept. “That allowed the companies to move forward,” noted Dr. Vanderwall. “It also gave the vendor an environment, a clear use case and an interested party to learn the technology and learn how to implement it.”
This year, Allotrope is focused on driving adoption of the Framework, which is being guided by a Foundation roadmap. “What can we do to make the technical path smoother?” is one question the roadmap addresses, according to Dr. Vanderwall. “The other is a strategic-level business analysis across our member companies to paint a more detailed, longer-term picture of what our priorities are for the areas of our business and the kinds of instruments that we would want to see enabled sooner rather than later.” Dr. Antonucci told IBO, “We think [the roadmap] will lead to the type of Allotrope ecosystem that we’re trying to create, which enables us to freely have well-structured and contextualized scientific data that allows [us] to essentially interrogate it however we want.”
Also on Allotrope’s immediate agenda is addressing the growing development community. “There is the core development that Allotrope is funding to build and maintain the fundamental elements of the Framework, but then there are also communities, companies and working groups working together to develop other models to expand the instruments and workflows covered, based on the patterns and the models Allotrope has built to date,” emphasized Dr. Vanderwall. “We’re basically starting to clone the effort. We’re actually shifting a little more of our resources to the governance process, like an SDLC-type (Software Development Life Cycle) plan for the software components, data models and ontology, so that we can handle the intake from these other groups that are building models and ontologies, or extending the software libraries.”
Allotrope will hold two workshops this year on the Framework. This spring, a public workshop will be held in Velizy, France, on April 9, hosted by Dassault. This fall, a second workshop will be held in San Francisco, California, on October 8, hosted by Genentech. For more information, contact email@example.com.
Analytical Instrument Vendors in the Allotrope Partner Network:
- Agilent Technologies
- Bio-Rad Laboratories
- Leap Technologies
- Thermo Fisher Scientific
- Unchained Labs