One of the most rapidly evolving areas in life science and chemical analysis is the informatics segment, and analytical data handling and knowledge management company ACD/Labs has worked to be at the forefront of changes in the industry. IBO had the opportunity to chat with Andrew Anderson, vice president of Innovation and Informatics Strategy at ACD/Labs, at Pittcon last month about how partnerships inspire innovation, the effects of AI and IoT on lab workflows, and the company’s latest product offering.
Innovating through Collaboration
ACD/Labs maintains collaborations with a range of industrial, academic and other institutional partners. To list a few: last year it renewed its partnership with the UK’s Royal Society of Chemistry for the ChemSpider chemical structure database, and in 2017 it collaborated with educational provider Pearson to support its Master Chemistry course and partnered with a biodiversity natural products organization in Brazil. “What we aim to do is whatever we develop from a partnership perspective, we try to roll it into our commercial applications,” said Mr. Anderson. “That way, there is sort of a consistent set of capabilities, and it does not require devoted maintenance resources and things like that. It is a win-win: for the partner, because they do not have to pay for that exclusivity, and we can get the benefit out of the partnership on our side as well.”
Collaborating with organizations and customers plays a key role in the strategy of the company. “We maintain an active customer success program, where we work very closely with our existing customers to rank and order what they are asking for; we have a devoted development organization just working on customer-success requests,” Mr. Anderson stated. “It is about 20% of our overall R&D investment, supporting current customers with new features and functionality. We maintain an active and spirited development branch for our existing products, so for folks that invest in ACD/Labs, that is a really rewarding experience because they get to see a lot of things they are asking for implemented.”
ACD/Labs is also a partner of the Allotrope Foundation, an international consortium of drug makers that works on standardizing and contextualizing experimental data to streamline data collection, management and analysis (see New Progress for Lab Instrument Data Standardization for more information on the Foundation). The company works with both members and other partners of the Allotrope Foundation. “We are developing tool kits, if you will, that will ultimately allow for [Allotrope’s] standard to proliferate across the different organizations,” explained Mr. Anderson. “We can actually generate the Allotrope formatted data from our system, and we can also consume data if it is written in the Allotrope format. We are enthusiastic about Allotrope because it really helps establish visibility for analytical data and the need to utilize analytical data in decision-making. We help customers with that mission, so an organization like Allotrope is really important to ACD/Labs.”
ACD/Labs’ partnerships help facilitate and accelerate its customers’ research. As Mr. Anderson elucidated, “We are not a data source—in a lot of ways, we [provide] a decision support interface for our customers. People that buy and license our software, they will use our applications to make decisions about things, whether they are scientists that need to look at analytical data very tediously, or they are a development project team member [who] needs to compare different lots of drug substance or drug product in their supply chain. Analytical data are the eyes to your organization.”
Through actively working with customers and partners, the company is able to develop applications that are targeted and specific. “Our applications are purpose built, so in a lot of ways, the lifeblood of ACD/Labs is the data sources we take in,” said Mr. Anderson. “A lot of the alternative ways that people build applications to perform this decision support [include hacking] the file formats—they will literally take a binary file format and reverse engineer it, and then support that format,” he continued. “We do not take that approach. We work actively with [companies such as] Waters, Agilent, Thermo Fisher Scientific, SCIEX [and] Shimadzu. Without that, we can’t innovate.”
As an example, Mr. Anderson referred to the evolution of MS over the past 15 years as an invaluable tool for molecular analysis in fields such as chemistry and biology. “If you look at what is going on right now in biology, [such as] characterizing things like monoclonal antibodies or antibody drug conjugates, at the forefront of that is MS,” he said. “The formats of that type of data are much more complex than, say, a single quadrupole MS. It is MSn—an n-dimensional data structure. Without our partners that are innovating on the technique side, we do not have the ability to innovate, to help our customers make more informed and valuable decisions within their organizations. That is why partnerships are so important to us.”
Data on Cloud Nine
Data structure is key to informatics, as is storing the data in a way that is convenient and accessible. To address this, companies are increasingly providing cloud-based offerings. Although these help streamline workflows and centralize data, they are not without challenges. “If you look at the intent of the cloud, the intent of distributing data, it is two-fold,” Mr. Anderson said. “It makes proliferation of data easier, but it also lowers your infrastructure cost.” The cloud, he explained, allows for greater flexibility and also provides a financial incentive, since it does not depreciate or need to be replaced the way fixed-cost investments do. “The challenge is, you have to get the data [to the cloud],” he added.
As an example, Mr. Anderson pointed to medical imaging, where MRI files can be gigabytes in size. “If you are running a patient clinic and you want to put that data onto the cloud, you could probably generate a terabyte’s worth of data in just one day. So do you have enough bandwidth to get that data to the cloud?” he asked. “Another example is Shimadzu’s Q-TOF. It is a quadrupole MS format. If you are doing positive ion/negative ion mode scanning MS data, that is about a gigabyte per experiment, per injection,” he continued. “How are you going to get that from the hard disk to a data center, which is hard enough, let alone from a hard disk, outside of your firewall, then to AWS [Amazon Web Services]? Therein lies one of the fundamental challenges of the cloud, and I think we all need to think about how to best create the right user experience.”
The difference lies in the type of data. Streaming data, as Mr. Anderson explained, is easy to get to the cloud, as the data are small bits per unit of time. But analytical information, in which the data sizes are much larger, poses a greater challenge in streaming. “That contrast in IoT data is the real challenge,” he said. “Complex IoT, I think, is the hardest thing for the cloud to support, but I think there are some tricks that folks are using. I think 5G is going to be helpful with streaming data as well, so let’s cross our fingers that we can move a gigabyte’s worth of data in six minutes or less. There’s the tagline, right? ‘A gigabyte in six minutes or less!’”
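The figures quoted above translate into sustained bandwidth with a little arithmetic. The short Python sketch below is only illustrative: it assumes decimal units (1 gigabyte = 10^9 bytes), ignores protocol overhead, and the function name `required_mbps` is hypothetical rather than anything from ACD/Labs.

```python
def required_mbps(num_bytes: float, seconds: float) -> float:
    """Sustained throughput, in megabits per second, needed to move
    num_bytes within the given number of seconds (no overhead assumed)."""
    return num_bytes * 8 / seconds / 1e6


GB = 1e9   # assumption: decimal gigabyte
TB = 1e12  # assumption: decimal terabyte

# "A gigabyte in six minutes or less"
gigabyte_in_six_minutes = required_mbps(GB, 6 * 60)

# A clinic producing roughly a terabyte per day, uploaded around the clock
terabyte_per_day = required_mbps(TB, 24 * 60 * 60)

print(f"1 GB in 6 min: {gigabyte_in_six_minutes:.1f} Mbit/s")  # 22.2 Mbit/s
print(f"1 TB per day:  {terabyte_per_day:.1f} Mbit/s")         # 92.6 Mbit/s
```

Under these assumptions, the tagline corresponds to roughly 22 Mbit/s of sustained upload per instrument, while the terabyte-a-day clinic would need a link that holds more than 90 Mbit/s continuously.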
Meeting this challenge is a key goal from ACD/Labs’ perspective because the company has many cloud-enabled products. As Mr. Anderson explained, customers want their own cloud environments in which ACD/Labs can implement its solutions—that way, the customers can control and access the data. “We haven’t yet offered a hosted environment for our solutions because customers typically want more control,” he stated. “That may change over time, and we are willing to take on that challenge when it presents itself. So far, our pharma and industrial customers definitely want control over the environment.”
AI in the Lab
With reproducibility posing a key issue in the sciences, AI is emerging as a significant player that could help regulate and thus streamline workflows. As Mr. Anderson noted, AI can help with numerous issues, including chemical design. “When you want to make something, the first part that AI would play a role in is helping you define what you want to make,” he explained. “So in pharma and drug discovery, I’m changing a chemical structure to have a more impactful effect on the body (potency, selectivity, low toxicity, etc.). AI will help with the optimization of the molecule.”
AI will also be able to support the design and execution of an experiment. “The second step is, ‘OK, now I have something to make. How am I going to make it?’” Mr. Anderson continued. “There are some concerted efforts around retrosynthetic analysis to define how to make things more effective. If you use AI tools, you can reduce [lab] costs quite dramatically by picking a route that a human would not have thought of. You get a greater yield and you may not have to use some of the heavy metals that are toxic and are potentially contaminants, so it streamlines and accelerates the whole process of going from ‘I need to cure a disease or infection’ to ‘I have material in hand, so let’s go test my hypothesis.’”
Another area in which Mr. Anderson predicts AI will be impactful is designing scale-up processes more efficiently. “[It will support users who think,] ‘I know how to make something at a gram scale. How do I make it at a metric ton scale?’ That scale-up process optimization is an area that we are very excited about,” he added.
Katalysts for Change
To address many of these concerns, ACD/Labs last month launched Katalyst D2D, a web-based application for high-throughput experimentation, including reaction optimization, process development, catalyst screening and scale-up. According to Mr. Anderson, Katalyst D2D is the company’s first web-based application, which makes it much easier to deploy and also allows customers the freedom to tweak layouts in the application. “One of the things that’s a continuous challenge is that administrators want a wealth of functionality, but still want simple user interfaces,” he explained.
To illustrate his point, he referenced the theory of semiotics, which studies the relationship between a sign and its symbolic meanings through a signifier (an object that signifies), the signified (the meaning or concept the signifier refers to) and the referent (the concrete object to which the sign refers). “In software, where you get into trouble from the usability perspective is when your signifier is not a good indication of what the function is, what the affordance is,” he continued. “So I’m really mindful, as the head of Innovation here, that whenever we design software interfaces [at ACD/Labs] that they’re intuitive and easy to use.”
Part of making the application user-friendly is giving users customization capabilities that put them more in control of the signifiers they want. “What we’ve done with the Katalyst application is the layouts are user-configurable, so if we don’t get it right, we show users, any user, how to build their own layouts, how to build their own workflows,” Mr. Anderson said. Because the application can be modified to each user’s needs, it is easier for Katalyst D2D to suit a range of preferences. “One user will want one thing, one user will want another… it’s like art appreciation,” he added. “Sometimes people want diametrically opposed features and functions, so we try to empower users to do what they want.”
Katalyst D2D is built to address the rapidly changing informatics landscape, especially with regard to data structures. As Mr. Anderson explained, data have traditionally been stored in centralized repositories known as data lakes, which are relational databases in which users can place data that may not be structured. The data are sufficiently tagged, he said, so that users can have specialized applications that allow for rapid querying, making the data easily retrievable. Katalyst D2D, however, has an additional capability, as it can support non-relational data structures. “With non-relational data structures, you have a very monolithic table, but lots of different types of data that are tagged in a way that we can go to any record in that table, find the appropriate data and pull it out,” he stated.
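The idea of one monolithic table holding differently shaped, tagged records can be pictured with a small Python sketch. It is a generic illustration, not Katalyst D2D's actual schema or API: the record layout, the tag names and the `find` helper are all hypothetical.

```python
from typing import Any

# One "monolithic table": every record carries free-form tags plus a payload
# whose shape depends on the kind of data. All field names are hypothetical.
records: list[dict[str, Any]] = [
    {"tags": {"kind": "design", "experiment": "E1"},
     "data": {"catalyst": "Pd/C", "temp_c": 60}},
    {"tags": {"kind": "result", "experiment": "E1", "technique": "LC/MS"},
     "data": {"yield_pct": 72.5}},
    {"tags": {"kind": "result", "experiment": "E2", "technique": "LC/MS"},
     "data": {"yield_pct": 81.0}},
]


def find(table: list[dict[str, Any]], **wanted: str) -> list[dict[str, Any]]:
    """Return the payload of every record whose tags match all wanted values."""
    return [
        record["data"]
        for record in table
        if all(record["tags"].get(key) == value for key, value in wanted.items())
    ]


lcms_results = find(records, kind="result", technique="LC/MS")
e1_design = find(records, kind="design", experiment="E1")
print(lcms_results)  # [{'yield_pct': 72.5}, {'yield_pct': 81.0}]
print(e1_design)     # [{'catalyst': 'Pd/C', 'temp_c': 60}]
```

Retrieval here depends only on the tags, so new kinds of data can be added without changing a fixed set of columns.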
Katalyst D2D also offers support for data engineering. “We generate structured data that can be presented to machine learning applications,” said Mr. Anderson. “If you have ever heard anyone talk about AI, they say that 75% of the time that is spent on AI is on data engineering. In a way, we are automating that data engineering process.” This process, as Mr. Anderson explained, traditionally involves users deciding how to structure their data in a way that allows neural networks to find paths and deep learning applications to find the different levels and connection points. Because of this, structuring data can take a significant amount of time and effort. “With Katalyst, we take that time down to zero,” he continued. “You have this fully characterized dataset that is already structured, so you are literally just pushing that formatted data to your machine learning applications.”
In a way, as Mr. Anderson posited, it is like making an investment. “I will run these experiments. I will have a digital description of the design of the experiment, the execution of the experiment and the results of the characterization,” he said. “It is all consolidated into one data structure, so there is no data engineering required—you are just sending that off. That is our big investment: helping people engineer data so they can present it.”
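As a rough picture of what "one data structure" handed to a machine learning tool might look like, the Python sketch below flattens a nested experiment record (design, execution, results) into a single feature row. The field names and the `flatten` helper are assumptions made for illustration and do not describe how Katalyst D2D stores or exports data.

```python
def flatten(record: dict, prefix: str = "") -> dict:
    """Turn a nested dict into a flat one with dotted keys,
    e.g. {"design": {"solvent": "EtOH"}} -> {"design.solvent": "EtOH"}."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


# Hypothetical consolidated record for a single experiment.
experiment = {
    "design": {"catalyst": "Pd/C", "solvent": "EtOH"},
    "execution": {"temp_c": 60, "time_h": 4},
    "results": {"yield_pct": 72.5},
}

row = flatten(experiment)
print(row)
```

When every experiment is captured in the same consolidated shape, each one flattens to a row with the same columns, which is the form most tabular machine learning tools expect.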
Along with data structures, another industry trend that matters to ACD/Labs, Mr. Anderson noted, is agile implementation. “Currently we deploy a new version of software once a year, so we have a new version in the summer and an update in the winter,” he said. “If you look at the rate of development versus the rate of release, we can develop a lot faster than we can release. So what we are really trying to migrate to is a continuous release schedule, where as soon as something is released, it is available for customers to link up.”
Looking ahead, ACD/Labs is continuing to focus on providing the most intuitive and comprehensive customer experience it can. “I think for us, as a tools provider, [the goal is] to help facilitate those connections on scale up, on synthesis prediction and, ultimately, on target discovery and target selection,” Mr. Anderson said. “Those are the areas in which we want to help our customers.”