LabKey Software
LabKey Software approaches instrument data storage, sharing and analysis by way of web-based remote access. The company was formed in 2005 through a partnership between software developers and the Fred Hutchinson Cancer Research Center (FHCRC), after the FHCRC decided to develop an open-source application for storing and sharing large volumes of instrument data. Over time, the company has developed data repository, proteomics, flow cytometry and observational data handling capabilities within its online framework, the LabKey Server. The LabKey Server is a web application, programmed in Java, that runs on top of an SQL database. The company also fosters a wiki-based community for the development of modules used within the LabKey Server. LabKey’s revenues are based on professional services contracts to implement large-scale research projects on the LabKey Server. A large portion of the company’s revenues comes from two international research consortia of labs working on an HIV/AIDS vaccine.
Because the Server is open-source and Internet-based, LabKey can act as a facilitator in dealing with two large-scale instrument data handling problems: ontology, which is a standardized nomenclature of scientific terms used for labeling data; and the use of a specific file format. Dr. Kevin Banks, LabKey’s director of Marketing, described how experiment ontologies can be both integrated by the Server and tailored by researchers: “[The Server] provides a number of services that facilitate the creation, population, query and migration of those module-specific tables. LabKey Server also provides a ‘property store’ mechanism that allows a module to create lightweight data sets that can be queried and joined just like standard SQL tables, but are more easily created and dropped as part of user workflows.” As a result, the ontology depends on the module that is used.
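To illustrate what it means for a lightweight data set to be "queried and joined just like standard SQL tables" and then dropped, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. It does not use LabKey Server's own API or its property store; the table names, column names and sample values are all hypothetical.

```python
import sqlite3


def demo_join():
    """Create two small tables, join them, then drop the lightweight one.

    sqlite3 is only a stand-in here; the tables and values are hypothetical
    and do not reflect LabKey Server's actual schema.
    """
    conn = sqlite3.connect(":memory:")
    # A "standard" table of samples.
    conn.execute("CREATE TABLE samples (sample_id TEXT PRIMARY KEY, assay TEXT)")
    # A lightweight, workflow-specific table of annotation terms.
    conn.execute("CREATE TABLE annotations (sample_id TEXT, term TEXT)")
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?)",
        [("S1", "flow"), ("S2", "ms")],
    )
    conn.executemany(
        "INSERT INTO annotations VALUES (?, ?)",
        [("S1", "CD4+"), ("S2", "peptide")],
    )
    # Join the two tables on the shared sample identifier.
    rows = conn.execute(
        "SELECT s.sample_id, s.assay, a.term "
        "FROM samples s JOIN annotations a ON a.sample_id = s.sample_id "
        "ORDER BY s.sample_id"
    ).fetchall()
    # The lightweight table can be dropped once the workflow is finished.
    conn.execute("DROP TABLE annotations")
    conn.close()
    return rows
```

Calling `demo_join()` returns one row per sample, pairing each assay with its annotation term.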
As for file formats, although a number of LabKey Server programs have been developed specifically for file conversion, the inconvenience of having to convert remains. Nevertheless, tab-separated text files and Excel spreadsheets are often the standard file formats for Server modules. “We’ve also seen that de facto standards often win over committee-designed standard formats,” said Dr. Banks.
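Part of the appeal of tab-separated text as a de facto standard is how little code is needed to read it. The following sketch uses only Python's standard csv module; the column names and values are hypothetical and are not taken from any LabKey module.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


# Hypothetical two-row data set with a header line.
SAMPLE_TSV = "well\tsample\tcount\nA1\tcontrol\t1200\nA2\tstimulated\t1850\n"

rows = read_tsv(SAMPLE_TSV)
for row in rows:
    # All values are read as strings; convert numeric columns explicitly.
    print(row["well"], row["sample"], int(row["count"]))
```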
LabKey has created various open-source pipelines (or workflows) for applications in proteomics and flow cytometry by tactically grouping various modules and third-party software. For example, the LabKey Flow workflow can be used as a web-based flow cytometric analysis system. Flow cytometry data from intracellular cytokine staining assays done in 96-well plates are used to determine T-cell immunogenicity in HIV research. Data can be stored in the Server in the Flow Cytometry Standard (FCS) format. FCS files include fluorescence data for each well, as well as any sample-identifying keywords entered at the time of collection. Further analysis can also be performed, including gating and the calculation of event counts, frequencies, means, medians, standard deviations and percentiles for parameters. Another workflow, the Computational Proteomics Analysis System for MS data, uses integrated third-party programs such as PeptideProphet and ProteinProphet to analyze mzXML or pepXML files.
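The summary statistics mentioned for the Flow workflow (event counts, frequencies, means, medians, standard deviations and percentiles) can be illustrated with a short sketch. It assumes a deliberately simplified one-dimensional threshold gate on a single fluorescence parameter, which is not how LabKey Flow defines gates; the function name, threshold and intensity values are hypothetical.

```python
import statistics


def summarize_gate(values, threshold):
    """Apply a simple threshold gate and summarize the events that pass it.

    `values` holds one fluorescence intensity per event; events at or above
    `threshold` are treated as inside the gate (a simplifying assumption).
    """
    gated = [v for v in values if v >= threshold]
    if len(gated) < 2:
        raise ValueError("need at least two gated events to compute statistics")
    # quantiles(n=100) returns the 99 cut points between percentiles.
    cuts = statistics.quantiles(gated, n=100, method="inclusive")
    return {
        "event_count": len(gated),
        "frequency": len(gated) / len(values),  # fraction of all events
        "mean": statistics.mean(gated),
        "median": statistics.median(gated),
        "stdev": statistics.stdev(gated),  # sample standard deviation
        "p95": cuts[94],  # 95th percentile
    }


# Hypothetical intensities for eight events in one well.
summary = summarize_gate([10, 20, 30, 40, 50, 60, 70, 80], threshold=50)
print(summary)
```

Real gating is usually multi-dimensional and operates on event data decoded from FCS files, so this sketch only shows the arithmetic behind the reported statistics.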